The perfect PHP clean url generator

In my hunt for the perfect clean url (smart url, slug, permalink, whatever) generator I kept running into some exception or bug that made the function a piece of junk. But I recently found an easy solution that I hope I can call “definitive”.

Clean url generators are crucial for search engine optimization or just to tidy up the site navigation. They are even more important if you work with international characters, accented vowels /à, è, ì, .../, cedilla /ç/, dieresis /ë/, tilde /ñ/ and so on.

First of all we need to strip all special characters and punctuation away. This is easily accomplished with something like:

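function toAscii($str) {
	// Drop everything except letters, digits and the separator characters
	$clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $str);
	$clean = strtolower(trim($clean, '-'));
	// Collapse each run of separator characters into a single dash
	$clean = preg_replace("/[\/_|+ -]+/", '-', $clean);

	return $clean;
}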

With our toAscii function we can convert a string like “Hi! I’m the title of your page!” to hi-im-the-title-of-your-page. This is nice, but what happens with a title like “A piñata is a paper container filled with candy”?

The result will be a-piata-is-a-paper-container-filled-with-candy, which is not cool: the ñ is simply dropped. We need to convert each special character to its closest ASCII equivalent.
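
To see why the ñ vanishes instead of becoming n, it can help to look at the bytes. This is only a small sketch, assuming the input is UTF-8; the `\xC3\xB1` escape is used so the example does not depend on how the file itself is saved.

```php
<?php
// "ñ" in UTF-8 is the two-byte sequence 0xC3 0xB1.
$title = "A pi\xC3\xB1ata";

// The pattern runs byte by byte (no /u modifier), and neither byte is in
// the allowed ASCII set, so both are removed and nothing replaces them.
$stripped = preg_replace('/[^a-zA-Z0-9\/_|+ -]/', '', $title);

echo bin2hex("\xC3\xB1"), "\n"; // c3b1
echo $stripped, "\n";           // A piata
```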

There are many ways to do this; maybe the easiest is to use iconv.

setlocale(LC_ALL, 'en_US.UTF8');
function toAscii($str) {
	// Transliterate to ASCII first, so "ñ" becomes "n" instead of being stripped
	$clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
	$clean = preg_replace("/[^a-zA-Z0-9\/_| -]/", '', $clean);
	$clean = strtolower(trim($clean, '-'));
	$clean = preg_replace("/[\/_| -]+/", '-', $clean);

	return $clean;
}

I always work with UTF-8 but you can obviously use any character encoding recognized by your system. The piñata text is now transliterated into a-pinata-is-a-paper-container-filled-with-candy. Lovable.

Unless they are Spanish speakers, users will hardly search your site for the word piñata; they will most likely search for pinata. So you may want to store both versions in your database: a title field with the text as displayed and a slug field containing its ASCII counterpart.

We can add a delimiter parameter to our function so we can use it to generate both clean urls and slugs (in newspaper editing, a slug is a short name given to an article that is in production).

setlocale(LC_ALL, 'en_US.UTF8');
function toAscii($str, $delimiter='-') {
	$clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
	$clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
	$clean = strtolower(trim($clean, '-'));
	$clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

	return $clean;
}

// echo toAscii("A piñata is a paper container filled with candy.", ' ');
// returns: a pinata is a paper container filled with candy

There’s one more thing. The string “I’ll be back!” is converted to ill-be-back. This may or may not be an issue depending on your application. If you use the function to generate a searchable slug, for example, looking for “ill” would return the famous Terminator quote, which probably isn’t what you wanted. The next version takes a $replace parameter: a list of characters that are turned into spaces before anything else happens.

setlocale(LC_ALL, 'en_US.UTF8');
function toAscii($str, $replace=array(), $delimiter='-') {
	// Turn the caller's extra separator characters into spaces up front
	if( !empty($replace) ) {
		$str = str_replace((array)$replace, ' ', $str);
	}

	$clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
	$clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
	$clean = strtolower(trim($clean, '-'));
	$clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

	return $clean;
}

You can now pass extra characters to be treated as word separators. Calling toAscii("I'll be back!", "'") you’ll get i-ll-be-back. Also note that the apostrophe is replaced before the string is converted to ASCII, because the conversion itself can introduce apostrophes: on some systems é is transliterated to 'e. The characters you pass therefore have to be handled before the string is mangled by iconv.

The function now seems complete. Let’s stress test it.

echo toAscii("Mess'd up --text-- just (to) stress /test/ ?our! `little` clean url fun.ction!?-->");
// returns: messd-up-text-just-to-stress-test-our-little-clean-url-function

echo toAscii("Perché l'erba è verde?", "'"); // Italian
// returns: perche-l-erba-e-verde

echo toAscii("Peux-tu m'aider s'il te plaît?", "'"); // French
// returns: peux-tu-m-aider-s-il-te-plait

echo toAscii("Tänk efter nu – förr'n vi föser dig bort"); // Swedish
// returns: tank-efter-nu-forrn-vi-foser-dig-bort

echo toAscii("ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöùúûüýÿ");
// returns: aaaaaaaeceeeeiiiidnooooouuuuyssaaaaaaaeceeeeiiiidnooooouuuuyy

echo toAscii("Custom`delimiter*example", array('*', '`'));
// returns: custom-delimiter-example

echo toAscii("My+Last_Crazy|delimiter/example", '', ' ');
// returns: my last crazy delimiter example

I’m sure we are far from perfection, and probably some php/regex guru will soon bury me under my own ignorance by suggesting an über-simple alternative to my function. What do you think?

89 thoughts on “The perfect PHP clean url generator”

  1. Cool. I’ve seen some articles about this topic, but they’re not complete as this. If the language is English, everything seems to be OK, but with other languages, it’s not simple. I’ve tried many methods to take n first characters of a string, but I was not successful. This post helps me a lot. Thank you very much.

  2. I’m liking it. A few minor tweaks and I think it’s something i’ll be implementing in a few of my applications!

    Thanks for the post – good work! 🙂

    Rgds,
    JB

  3. But there is a “but”: I ran into a small problem.

    echo toAscii("Perché l'erba è verde?", "'"); // Italian
    didn’t return: perche-l-erba-e-verde
    but returned:
    Notice: iconv() [function.iconv]: Detected an illegal character in input string in {file} on line {line}
    perch

    So I added one line, and that solved it (or seems to).
    Here is the new function:

    function toAscii($str, $replace=array(), $delimiter='-', $charset='ISO-8859-1') {
    $str = iconv($charset, 'UTF-8', $str); // by lelebart

    if( !empty($replace) ) {
    $str = str_replace((array)$replace, ' ', $str);
    }

    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
    $clean = strtolower(trim($clean, '-'));
    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

    return $clean;
    }

    Hope this is helpful.

    1. Tried this but i got:

      “test é another for à and why not ô ?”
      returns
      “test-eacute-another-for-agrave-and-why-not-ocirc”

      Any idea 🙁

    2. Solution on quotes

      function toAscii($str, $replace=array(), $delimiter='-', $charset='ISO-8859-1') {
      $str = str_replace(
      array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151), chr(133)),
      array("'", "'", '"', '"', '-', '--', '...'),
      $str); // by mordomiamil
      $str = iconv($charset, 'UTF-8', $str); // by lelebart
      if( !empty($replace) ) {
      $str = str_replace((array)$replace, ' ', $str);
      }
      $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
      $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
      $clean = strtolower(trim($clean, '-'));
      $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);
      return $clean;
      }

  4. @lelebart, that probably happens because you are working in latin-1 instead of utf-8. If you work in iso-8859-1 you can simply replace all references to UTF-8 with ISO-8859-1. You don’t need to convert latin-1 to utf-8 first.
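
The “illegal character” notice quoted above is what iconv tends to report when the input is not valid in the source encoding it was told to expect. A minimal sketch of a pre-check follows. The helper name `guessSourceCharset` is made up for this example, and falling back to ISO-8859-1 is an assumption about your data, not something PHP can detect for you.

```php
<?php
// Hypothetical helper: returns the charset to hand to iconv() as the source.
function guessSourceCharset($str, $fallback = 'ISO-8859-1') {
    // With the /u modifier PCRE validates the subject as UTF-8, and
    // preg_match() returns false when it is malformed.
    return preg_match('//u', $str) === 1 ? 'UTF-8' : $fallback;
}

echo guessSourceCharset("Perch\xC3\xA9"), "\n"; // UTF-8 (é stored as two bytes)
echo guessSourceCharset("Perch\xE9"), "\n";     // ISO-8859-1 (é stored as one Latin-1 byte)
```

The result could then be passed as the first argument of the iconv() call, in place of the hard-coded 'UTF-8'.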

  5. @Matteo Spinelli: thanks. i didn’t know that. ^^
    so i need a simple line like that?
    $clean = iconv('ISO-8859-1', 'ASCII//TRANSLIT', $str);

  6. I can’t thank you enough for this wonderful script.

    Before I found this I used to clean urls using str_replace for each kind of special character…this script has saved me tons of time.

    !! 5 STARS !!

  7. Thank you very much, it’s just what I was looking for this morning.

    Just one thing: I don’t understand what you intended to do with:

    $str = str_replace((array)$replace, ' ', $str);

    and how you expect the $replace parameter to be passed. The default is array(), but in the example below you pass "'", which is a string, and inside the function it is converted to array("'"). But what if I pass "',;."? It will be converted to array("',;.") by (array)$replace and won’t work as expected in str_replace(). I should pass the $replace parameter directly as array("'", ",", ";", ".") instead.

    Hope it helps if I got the point, otherwise I didn’t get that part of the code.

    If the above is valid, then if you want to specify the $delimiter using the default $replace empty array(), you may want to modify the function to check for $replace type with is_string or !is_array:

    if(is_string($replace)) { // gonna use default $replace
    $delimiter = $replace;
    $replace = array();
    }

    Bye.

  8. @francesco: yes the $replace parameter must be an array so that you can specify multiple character delimiters (have a look at the custom delimiters example). It shouldn’t be needed to check if the parameter is a string or an array. I convert it to array anyway and it should work.
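
The behaviour discussed in this exchange can be checked in isolation. A small sketch, using only str_replace() and the (array) cast from the function above:

```php
<?php
// Casting a string to an array wraps it in a one-element array.
var_dump((array)"'" === array("'"));                     // bool(true)

// So a multi-character string is searched for as ONE needle, not per character.
$asString = str_replace((array)"',;.", ' ', "a,b;c.d");
echo $asString, "\n";                                    // a,b;c.d (unchanged)

// Passing each character as its own array element gives the intended result.
$asArray = str_replace(array("'", ',', ';', '.'), ' ', "a,b;c.d");
echo $asArray, "\n";                                     // a b c d
```

So a single character can be passed as a plain string, while several characters have to be passed as an array.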

  9. It’s very nice… but what about websites that are not in a US locale?

    setlocale(LC_ALL, 'en_US.UTF8');

    It will mess up the settings. Is there any way around this?

  10. @Chris,
    you can use any locale you have installed on your system, I’d suggest to stick with UTF8 though (it_IT.UTF8, fr_FR.UTF8, etc…). If that is not possible remember to change the iconv parameter as well.

  11. In your third and fourth examples, you trim the cleaned string from dashes. Is there a reason for this, or should it be trimmed by the delimiter?

    $clean = strtolower(trim($clean, '-'));

    to

    $clean = strtolower(trim($clean, $delimiter));

  12. Aled, thanks for pointing that out. Actually you’d need to trim all of the following characters: /_|+ -
    It’s just to be sure that the string doesn’t end with a delimiter.
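
One way to read this suggestion is to hand trim() the same character list that the final regex collapses. A sketch, assuming ASCII input and leaving the iconv step out so it stays focused on trimming; the function name `trimSeparators` is only for illustration.

```php
<?php
function trimSeparators($clean) {
    // Every character that would later be turned into the delimiter.
    // Inside trim()'s character list ".." denotes a range, so it is best
    // avoided here; a single "-" is taken literally.
    return trim($clean, '/_|+ -');
}

echo trimSeparators('-- /hello world_+ '), "\n"; // hello world
```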

  14. This is an odd request, but if I wanted to keep UTF-8 characters but continue to strip miscellaneous characters such as ASCII punctuation, how would I modify this function?

    I know it’s not SEO preferred but our niche apps need their native encoding (UTF-8) and not ASCII substitutions. 😉

  15. Sorry for double comment, I tried the original function and it’s simply deleting all our Russian/Chinese/etc. encoded strings. If you could help me with a modification that keeps UTF-8 characters but cleans out punctuation I would be much obliged. =)

  16. @Matteo: Sorry I didn’t see your comment before my last post.

    I’m not sure I need an array, I’m not trying to keep any specific UTF-8 characters. I’m just trying to keep all UTF-8 characters but do the rest of your function (strip punctuation, replace whitespace with dashes, etc. slug preparing). I’ve been scouring the net for a function that just strips punctuation/prepares slugs but leaves UTF-8 untouched.

    I found a Python version of what I want but not sure how to convert it to PHP:
    http://semicolons.org/archives/friendly-urls.html
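
A possible approach to the request above is to skip iconv entirely and let PCRE's Unicode properties decide what counts as a letter. This is a sketch, not the author's function: the name `toUnicodeSlug` is made up, the input is assumed to be valid UTF-8, and lowercasing relies on the mbstring extension when it is available.

```php
<?php
// Hypothetical variant of toAscii() that keeps non-ASCII letters.
function toUnicodeSlug($str, $delimiter = '-') {
    // \p{L} = letters, \p{M} = combining marks, \p{N} = numbers, in any script.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $clean = preg_replace('/[^\p{L}\p{M}\p{N}\/_|+ -]+/u', '', $str);
    $clean = trim($clean, '/_|+ -');
    $clean = preg_replace('/[\/_|+ -]+/', $delimiter, $clean);

    // strtolower() only understands single bytes; use mbstring when present.
    return function_exists('mb_strtolower')
        ? mb_strtolower($clean, 'UTF-8')
        : $clean;
}

echo toUnicodeSlug('привет, мир!'), "\n"; // привет-мир
echo toUnicodeSlug('你好, 世界?'), "\n";   // 你好-世界
```

Note that preg_replace() returns null when the input is not valid UTF-8, so real code would want to check for that before trimming.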

  18. function remove_accent($str)
    {
    $a = array('À','Á','Â','Ã','Ä','Å','Æ','Ç','È','É','Ê','Ë','Ì','Í','Î','Ï','Ð','Ñ','Ò','Ó','Ô','Õ','Ö','Ø','Ù','Ú','Û','Ü','Ý','ß','à','á','â','ã','ä','å','æ','ç','è','é','ê','ë','ì','í','î','ï','ñ','ò','ó','ô','õ','ö','ø','ù','ú','û','ü','ý','ÿ','Ā','ā','Ă','ă','Ą','ą','Ć','ć','Ĉ','ĉ','Ċ','ċ','Č','č','Ď','ď','Đ','đ','Ē','ē','Ĕ','ĕ','Ė','ė','Ę','ę','Ě','ě','Ĝ','ĝ','Ğ','ğ','Ġ','ġ','Ģ','ģ','Ĥ','ĥ','Ħ','ħ','Ĩ','ĩ','Ī','ī','Ĭ','ĭ','Į','į','İ','ı','IJ','ij','Ĵ','ĵ','Ķ','ķ','Ĺ','ĺ','Ļ','ļ','Ľ','ľ','Ŀ','ŀ','Ł','ł','Ń','ń','Ņ','ņ','Ň','ň','ʼn','Ō','ō','Ŏ','ŏ','Ő','ő','Œ','œ','Ŕ','ŕ','Ŗ','ŗ','Ř','ř','Ś','ś','Ŝ','ŝ','Ş','ş','Š','š','Ţ','ţ','Ť','ť','Ŧ','ŧ','Ũ','ũ','Ū','ū','Ŭ','ŭ','Ů','ů','Ű','ű','Ų','ų','Ŵ','ŵ','Ŷ','ŷ','Ÿ','Ź','ź','Ż','ż','Ž','ž','ſ','ƒ','Ơ','ơ','Ư','ư','Ǎ','ǎ','Ǐ','ǐ','Ǒ','ǒ','Ǔ','ǔ','Ǖ','ǖ','Ǘ','ǘ','Ǚ','ǚ','Ǜ','ǜ','Ǻ','ǻ','Ǽ','ǽ','Ǿ','ǿ');
    $b = array('A','A','A','A','A','A','AE','C','E','E','E','E','I','I','I','I','D','N','O','O','O','O','O','O','U','U','U','U','Y','s','a','a','a','a','a','a','ae','c','e','e','e','e','i','i','i','i','n','o','o','o','o','o','o','u','u','u','u','y','y','A','a','A','a','A','a','C','c','C','c','C','c','C','c','D','d','D','d','E','e','E','e','E','e','E','e','E','e','G','g','G','g','G','g','G','g','H','h','H','h','I','i','I','i','I','i','I','i','I','i','IJ','ij','J','j','K','k','L','l','L','l','L','l','L','l','l','l','N','n','N','n','N','n','n','O','o','O','o','O','o','OE','oe','R','r','R','r','R','r','S','s','S','s','S','s','S','s','T','t','T','t','T','t','U','u','U','u','U','u','U','u','U','u','U','u','W','w','Y','y','Y','Z','z','Z','z','Z','z','s','f','O','o','U','u','A','a','I','i','O','o','U','u','U','u','U','u','U','u','U','u','A','a','AE','ae','O','o');
    return str_replace($a, $b, $str);
    }

    function post_slug($str)
    {
    return strtolower(preg_replace(array('/[^a-zA-Z0-9 -]/', '/[ -]+/', '/^-|-$/'), array('', '-', ''), remove_accent($str)));
    }

  19. It didn’t work for me

    function toAscii($str, $replace=array(), $delimiter='-') {
    if( !empty($replace) ) {
    $str = str_replace((array)$replace, ' ', $str);
    }

    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
    $clean = strtolower(trim($clean, '-'));
    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

    return $clean;
    }
    echo toAscii('Necesito una piñata para mi cumpleaños'), '';
    RESULT: necesito-una-piata-para-mi-cumpleaos

  20. I’ve been experiencing problems with both the suggested solution by the author (some “strange” characters were not converted, which cutoff the result) and one in the comments by Weblap.ro (list of “strange” characters is not complete). So I decided to combine them and enriched the list of “strange” characters.

    I also added a rtrim and ltrim, so you won’t have an “-” at the beginning or the end when there is a space.

    New function:

    setlocale(LC_ALL, 'en_US.UTF8');

    function toAscii($str, $replace=array(), $delimiter="-") {
    if( !empty($replace) ) {
    $str = str_replace((array)$replace, ' ', $str);
    }

    $a = array(‘À’,’Á’,’Â’,’Ã’,’Ä’,’Å’,’Æ’,’Ç’,’È’,’É’,’Ê’,’Ë’,’Ì’,’Í’,’Î’,’Ï’,’Ð’,’Ñ’,’Ò’,’Ó’,’Ô’,’Õ’,’Ö’,’Ø’,’Ù’,’Ú’,’Û’,’Ü’,’Ý’,’ß’,’à’,’á’,’â’,’ã’,’ä’,’å’,’æ’,’ç’,’è’,’é’,’ê’,’ë’,’ì’,’í’,’î’,’ї’,’ñ’,’ò’,’ó’,’ô’,’õ’,’ö’,’ø’,’ù’,’ú’,’û’,’ü’,’ý’,’ÿ’,’Ā’,’ā’,’Ă’,’ă’,’Ą’,’ą’,’Ć’,’ć’,’Ĉ’,’ĉ’,’Ċ’,’ċ’,’Č’,’č’,’Ď’,’ď’,’Đ’,’đ’,’Ē’,’ē’,’Ĕ’,’ĕ’,’Ė’,’ė’,’Ę’,’ę’,’Ě’,’ě’,’ё’,’Ĝ’,’ĝ’,’Ğ’,’ğ’,’Ġ’,’ġ’,’Ģ’,’ģ’,’Ĥ’,’ĥ’,’Ħ’,’ħ’,’Ĩ’,’ĩ’,’Ī’,’ī’,’Ĭ’,’ĭ’,’Į’,’į’,’İ’,’ı’,’ї’,’IJ’,’ij’,’Ĵ’,’ĵ’,’Ķ’,’ķ’,’Ĺ’,’ĺ’,’Ļ’,’ļ’,’Ľ’,’ľ’,’Ŀ’,’ŀ’,’Ł’,’ł’,’Ń’,’ń’,’Ņ’,’ņ’,’Ň’,’ň’,’ʼn’,’Ō’,’ō’,’Ŏ’,’ŏ’,’Ő’,’ő’,’Œ’,’œ’,’Ŕ’,’ŕ’,’Ŗ’,’ŗ’,’Ř’,’ř’,’Ś’,’ś’,’Ŝ’,’ŝ’,’Ş’,’ş’,’Š’,’š’,’Ţ’,’ţ’,’Ť’,’ť’,’Ŧ’,’ŧ’,’Ũ’,’ũ’,’Ū’,’ū’,’Ŭ’,’ŭ’,’Ů’,’ů’,’Ű’,’ű’,’Ų’,’ų’,’Ŵ’,’ŵ’,’Ŷ’,’ŷ’,’Ÿ’,’Ź’,’ź’,’Ż’,’ż’,’Ž’,’ž’,’ſ’,’ƒ’,’Ơ’,’ơ’,’Ư’,’ư’,’Ǎ’,’ǎ’,’Ǐ’,’ǐ’,’Ǒ’,’ǒ’,’Ǔ’,’ǔ’,’Ǖ’,’ǖ’,’Ǘ’,’ǘ’,’Ǚ’,’ǚ’,’Ǜ’,’ǜ’,’ϋ’,’Ǻ’,’ǻ’,’Ǽ’,’ǽ’,’Ǿ’,’ǿ’,’΄’,’ό’,’Α’,’ϊ’,’ฺB’,’Η’,’Ḩ’,’ā’,’ţ’,’ḯ’,’ố’,’ạ’,’ẖ’,’ộ’,’Ḩ’,’ḩ’,’H̱’);
    $b = array(‘A’,’A’,’A’,’A’,’A’,’A’,’AE’,’C’,’E’,’E’,’E’,’E’,’I’,’I’,’I’,’I’,’D’,’N’,’O’,’O’,’O’,’O’,’O’,’O’,’U’,’U’,’U’,’U’,’Y’,’s’,’a’,’a’,’a’,’a’,’a’,’a’,’ae’,’c’,’e’,’e’,’e’,’e’,’i’,’i’,’i’,’i’,’n’,’o’,’o’,’o’,’o’,’o’,’o’,’u’,’u’,’u’,’u’,’y’,’y’,’A’,’a’,’A’,’a’,’A’,’a’,’C’,’c’,’C’,’c’,’C’,’c’,’C’,’c’,’D’,’d’,’D’,’d’,’E’,’e’,’E’,’e’,’E’,’e’,’E’,’e’,’E’,’e’,’e’,’G’,’g’,’G’,’g’,’G’,’g’,’G’,’g’,’H’,’h’,’H’,’h’,’I’,’i’,’I’,’i’,’I’,’i’,’I’,’i’,’I’,’i’,’i’,’IJ’,’ij’,’J’,’j’,’K’,’k’,’L’,’l’,’L’,’l’,’L’,’l’,’L’,’l’,’l’,’l’,’N’,’n’,’N’,’n’,’N’,’n’,’n’,’O’,’o’,’O’,’o’,’O’,’o’,’OE’,’oe’,’R’,’r’,’R’,’r’,’R’,’r’,’S’,’s’,’S’,’s’,’S’,’s’,’S’,’s’,’T’,’t’,’T’,’t’,’T’,’t’,’U’,’u’,’U’,’u’,’U’,’u’,’U’,’u’,’U’,’u’,’U’,’u’,’W’,’w’,’Y’,’y’,’Y’,’Z’,’z’,’Z’,’z’,’Z’,’z’,’s’,’f’,’O’,’o’,’U’,’u’,’A’,’a’,’I’,’i’,’O’,’o’,’U’,’u’,’U’,’u’,’U’,’u’,’U’,’u’,’U’,’u’,’u’,’A’,’a’,’AE’,’ae’,’O’,’o’,”,’o’,’a’,’i’,’b’,’h’,’h’,’a’,’t’,’i’,’o’,’a’,’h’,’o’,’h’,’h’,’h’);
    $clean = str_replace($a, $b, $str);

    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $clean);
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
    $clean = strtolower(trim(ltrim(rtrim($clean)), '-'));
    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

    return $clean;
    }

    1. Haha your trim line made me laugh. Trim does ltrim and rtrim, additionally you can add characters to trim to trim certain chars. It was fine how it was, just add a space to the list of chars to trim:

      $clean = strtolower(trim(ltrim(rtrim($clean)), '-'));

      Should just be strtolower(trim($clean, ' -'));

  21. One problem: throw in some Asian characters, or any language from the higher Unicode ranges, and you get nothing 🙁

  22. It’s all well and good going one way, but you need to be able to go back the other way also.

  23. Hey, just to say that your slug generator function is great!!! The only thing I noticed is that some slugs ended with a “-” when the original title had an empty space at the end (users…) So I moved this line:

    $clean = strtolower(trim($clean, '-'));

    after this one:

    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

    So it’s right before the return statement. Now they’re ok! I wonder if that’s because I replaced “$delimiter” with “-” in my code…
    Many thanks for this nice piece of code!
    Cheers!
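
The effect of the reordering described in this comment can be reproduced with plain ASCII input. A sketch, leaving out the iconv and stripping steps and assuming a non-empty delimiter:

```php
<?php
$title = 'Hello World ';   // note the trailing space
$delimiter = '-';

// Original order: trim first, collapse afterwards. The trailing space
// survives the trim and is then turned into a delimiter.
$a = strtolower(trim($title, '-'));
$a = preg_replace('/[\/_|+ -]+/', $delimiter, $a);

// Reordered: collapse first, then trim the delimiter off both ends.
$b = preg_replace('/[\/_|+ -]+/', $delimiter, $title);
$b = strtolower(trim($b, $delimiter));

echo $a, "\n"; // hello-world-
echo $b, "\n"; // hello-world
```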

  24. A silly question!
    Where should I put the code? In an .htaccess file, or somewhere else?

    Thanks

  25. I’ve been doing some work on the localization settings and figured out that only LC_CTYPE needs to be changed.

    Additionally, you can back up and restore the locale setting by doing the following:

    $oldLocale = setlocale(LC_CTYPE, 0);
    setlocale(LC_CTYPE, 'en_US.UTF8');
    // Make slug…
    setlocale(LC_CTYPE, $oldLocale);
    // Return slug
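
The save-and-restore idea above could be wrapped in a small helper so that the locale is put back even if the slug code throws. This is a sketch: `withCtypeLocale` and `transliterate` are names invented for the example, and `try`/`finally` requires PHP 5.5 or later.

```php
<?php
// Hypothetical helper: run $fn with LC_CTYPE switched, then put it back.
function withCtypeLocale($locale, callable $fn) {
    $old = setlocale(LC_CTYPE, '0');   // "0" only queries the current setting
    setlocale(LC_CTYPE, $locale);      // returns false if the locale is not installed
    try {
        return $fn();
    } finally {
        setlocale(LC_CTYPE, $old);     // restored even if $fn throws
    }
}

// Intended use (defined but not called here): wrap only the transliteration step.
function transliterate($str) {
    return withCtypeLocale('en_US.UTF8', function () use ($str) {
        return iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    });
}
```

The rest of the request then keeps whatever locale the site was configured with.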

  26. Thank you !!!
    My idea was to pass through characters array and change them.
    Thanks again

  27. Hi,

    nice article.

    I’ve not read through all comments, so don’t flame when it’s already been mentioned…

    – You need to make sure that you save your php documents in the right format. So if you use UTF-8 you need to save it as UTF-8. Preferably UTF-8 without BOM to make it work.

    If you don’t know what I’m talking about or you need more info on how to save your files with the correct encoding – please search google for UTF-8 Encoding.

    Greetz from Spain,

    Dev.

  28. Great job man, thanks!
    I just put the code below before the return:
    [code]
    $clean = strtolower(trim($clean, '-'));
    [/code]
    This removes the “-” at the end of the string.
    bye bye

  29. Brilliant use of iconv. Kudos.

    I just wanted to add that
    /[^a-zA-Z0-9\/_|+ -]/
    can be reduced a little to
    %[^-/+|\w ]%
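
The two patterns can be compared directly on ASCII input. A short sketch; note that `\w` may match additional bytes when a non-C locale is active, so the equivalence is only claimed here for plain ASCII text.

```php
<?php
$input = "Hello, World! foo_bar/baz+qux|x-y (100%)";

$long  = preg_replace('/[^a-zA-Z0-9\/_|+ -]/', '', $input);
// % as the delimiter means "/" needs no escaping; \w covers a-z, A-Z, 0-9 and _.
$short = preg_replace('%[^-/+|\w ]%', '', $input);

var_dump($long === $short); // bool(true)
echo $short, "\n";          // Hello World foo_bar/baz+qux|x-y 100
```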

  30. This function deletes dots in filenames. How to preserve them?
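
One possible answer to this question is to add the dot to the set of characters that survive the first regex. A sketch based on the ASCII-only version of the function, with the iconv step left out; the name `toAsciiFilename` is hypothetical.

```php
<?php
// Hypothetical filename variant: identical to the ASCII-only toAscii()
// except that "." is added to the set of characters that survive.
function toAsciiFilename($str, $delimiter = '-') {
    $clean = preg_replace('/[^a-zA-Z0-9\/_|+ .-]/', '', $str);
    $clean = strtolower(trim($clean, '-'));
    return preg_replace('/[\/_|+ -]+/', $delimiter, $clean);
}

echo toAsciiFilename('My Holiday Photo (1).JPG'), "\n"; // my-holiday-photo-1.jpg
```

This keeps every dot, not only the one before the extension, which may or may not be what you want for uploaded files.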

  31. This function is great, nice clean and short. Is it possible to have the function change the following, if these are all in the same array…

    foo-bar foo > This I need to change to “foo-barfoo”

    foo bar foo > This I would need to change to “foobarfoo”

    but i can’t figure out how to do it with this function, without changing how it works.

    Can you help?

    1. Interesting.. that’s an odd example but I would do this by counting the blanks within and maybe putting each word into an array.. then associate and replace the blank.

      All depends on whether the bar foo are under your control or not, could become complicated quite quickly.

      Maybe a silly question, but are you just looking to delete the space in all words?


  33. Nice, will be using this for cleaning the urls on my new expatriate website. Thanks 🙂

  34. How does it cope with foreign languages like Greek and Russian? Or do I have to add a function that will convert “здраствуйте” into “zdrastvuyte” for example?

  35. Limit characters

    function url_slug($str, $replace=array(), $delimiter='-', $maxLength=200) {

    	if( !empty($replace) ) {
    		$str = str_replace((array)$replace, ' ', $str);
    	}

    	$clean = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    	$clean = preg_replace("%[^-/+|\w ]%", '', $clean);
    	$clean = strtolower(trim(substr($clean, 0, $maxLength), '-'));
    	$clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);

    	return $clean;
    }
    
  36. Why do I get this message? Fatal error: Cannot redeclare toAscii() (previously declared in

    Thanks
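
“Cannot redeclare” usually means the file that defines the function is being loaded more than once in the same request. A minimal sketch of one common guard, using the first ASCII-only version of the function so the example stays self-contained:

```php
<?php
// Declaring the same function twice in one request triggers
// "Cannot redeclare"; the guard makes a second inclusion harmless.
if (!function_exists('toAscii')) {
    function toAscii($str) {
        $clean = preg_replace('/[^a-zA-Z0-9\/_|+ -]/', '', $str);
        $clean = strtolower(trim($clean, '-'));
        return preg_replace('/[\/_|+ -]+/', '-', $clean);
    }
}

echo toAscii("Hi! I'm the title of your page!"), "\n"; // hi-im-the-title-of-your-page
```

Another option is to keep the function in its own file and load it with require_once instead of include or require.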

  37. Thanks for this! I’ve edited it a little bit because it didn’t work correctly for me:

    function parseUrl($sVar) {
    $sDelimiter = '-';
    $sVar = urldecode($sVar);
    $sVar = iconv('UTF-8', 'ASCII//TRANSLIT', $sVar);
    $sVar = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $sVar);
    $sVar = strtolower(trim($sVar, '-'));
    $sVar = preg_replace("/[\/_|+ -]+/", $sDelimiter, $sVar);
    return $sVar;
    }

  38. Everyone seems to forget about “ø” and “Ø” 🙂
    str_replace("ø", "oe", ..)
    str_replace("Ø", "OE", ..)

    in some cases you may only want
    ø -> o
    Ø -> O
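
A mapping like this could run as a pre-pass before the iconv call. A sketch, assuming the source file and the input are both UTF-8; the function name `replaceSlashedO` is made up for the example.

```php
<?php
// Hypothetical pre-pass: characters handled by hand before transliteration.
// Swap 'o'/'O' for 'oe'/'OE' if that suits your language better.
function replaceSlashedO($str) {
    return strtr($str, array('ø' => 'o', 'Ø' => 'O'));
}

echo replaceSlashedO('Københavns Ølbar'), "\n"; // Kobenhavns Olbar
```

strtr() with an array replaces whole substrings in a single pass, so multi-byte characters are matched as units.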

  39. I need to do this on my site. I’m currently using CodeIgniter’s native url generator function, but it’s not good with foreign letters. My recent photos from Cordoba in Argentina are examples of letters going missing.

  40. I think we need to limit what gets stripped at the end of the sentence.
    function toAscii($str) {
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $str);
    $clean = strtolower(trim($clean));
    $clean = preg_replace("/[\/_|+ -]+/", '-', $clean);

    return $clean;
    }
    $str="I just say no! #$%^&*";
    print toAscii($str); //i-just-say-no

  41. Marvelous! Just what I needed this very moment.
    I have a client who tried to use foreign characters in a file name during upload.

    So, before adding this to the client site (it’s online and critical), I just need confirmation that the code is fully working.

    (Since there are replies with edited code)
