<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jens Arps &#187; string</title>
	<atom:link href="http://jensarps.de/tag/string/feed/" rel="self" type="application/rss+xml" />
	<link>http://jensarps.de</link>
	<description></description>
	<lastBuildDate>Tue, 07 Sep 2010 14:11:57 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>More fun with strings: dojo.string.contains()</title>
		<link>http://jensarps.de/2009/11/29/more-fun-with-strings-dojo-string-contains/</link>
		<comments>http://jensarps.de/2009/11/29/more-fun-with-strings-dojo-string-contains/#comments</comments>
		<pubDate>Sun, 29 Nov 2009 19:27:25 +0000</pubDate>
		<dc:creator>Jens Arps</dc:creator>
				<category><![CDATA[Dojo Love]]></category>
		<category><![CDATA[Experiments in Web]]></category>
		<category><![CDATA[Goodies to go]]></category>
		<category><![CDATA[dojo]]></category>
		<category><![CDATA[fun]]></category>
		<category><![CDATA[js]]></category>
		<category><![CDATA[string]]></category>

		<guid isPermaLink="false">http://jensarps.de/?p=114</guid>
		<description><![CDATA[In the series &#8220;convenience wrappers for small tasks that increase code readability&#8221;, today contains() is starring. Having a contains() method could also serve another purpose: to maybe prevent people from using match() to find out if a string contains a given substring (what is still proposed in some JS tutorials out there…). So, I want ]]></description>
			<content:encoded><![CDATA[<p>In the series &#8220;convenience wrappers for small tasks that increase code readability&#8221;, today&#8217;s star is contains(). Having a contains() method could also serve another purpose: it might keep people from using match() to find out whether a string contains a given substring (which is still proposed in some JS tutorials out there…). I also want the contains() method to have a switch for case-insensitive matching.</p>
<p>Besides indexOf(), there are some other ways to achieve this, so – let&#8217;s have a competition and find out who&#8217;s the fastest!<br />
<span id="more-114"></span></p>
<h3>The Contestants</h3>
<p><strong>replace and length</strong></p>
<p>One possibility is to replace the searched string with an empty string, read the length property of the modified string and compare it to the length of the haystack. If it&#8217;s different, the haystack contains the needle. Reading the length property is extra work, but comparing two integers is faster than comparing two strings, so maybe it&#8217;s faster in general.</p>
<p><strong>replace</strong></p>
<p>Again, replace the searched string with an empty string. Then compare the modified string with the haystack. If they are different, the haystack contains the needle.</p>
<p><strong>split</strong></p>
<p>Take the haystack and try to split it using the needle as the separator. If the result&#8217;s length is greater than one, the haystack contains the needle.</p>
<p><strong>indexOf</strong></p>
<p>Find the first occurrence of needle in haystack; if the result is anything other than -1, the haystack contains the needle.</p>
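<p>As a rough sketch, the four contestants could be written as follows. The function names are made up for illustration and are not taken from the test page; each takes (needle, haystack) and returns a boolean.</p>

```javascript
// Hypothetical helper names, for illustration only.
// replace() with a string pattern replaces only the first match,
// which is enough to detect whether the needle occurs at all.

// 1) replace and length: compare the lengths before and after
function containsByReplaceLength(needle, haystack) {
    return haystack.replace(needle, '').length !== haystack.length;
}

// 2) replace: compare the strings before and after
function containsByReplace(needle, haystack) {
    return haystack.replace(needle, '') !== haystack;
}

// 3) split: more than one piece means the separator was found
function containsBySplit(needle, haystack) {
    return haystack.split(needle).length > 1;
}

// 4) indexOf: anything other than -1 is a hit
function containsByIndexOf(needle, haystack) {
    return haystack.indexOf(needle) !== -1;
}
```

<p>These variants only agree for a non-empty needle: with an empty needle, the two replace variants report false, while indexOf reports true and split reports true for any haystack longer than one character.</p>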
<h3>Results</h3>
<p>For testing, I did 10,000 iterations and retrieved the execution time in ms. The methods were tested in the order presented above.</p>
<p>On Chromium (Mac build, Version 4.0.203.0 here), replace + length is slightly faster than replace, and both are faster than split. indexOf is by far the fastest.</p>
<p>1) true: 7 / false: 3.5<br />
2) true: 7.5 / false: 3.5<br />
3) true: 10 / false: 8<br />
4) true: 2.5 / false: 3</p>
<p>Safari 4 has nearly the same results as Chromium, but the numbers tend to differ a lot from test to test.</p>
<p>1) true: 4-18, avg 10 / false: 2.5<br />
2) true: 4-19, avg 10 / false: 3<br />
3) true: 6-22, avg 13 / false: 6-23, avg 16<br />
4) true: 2 / false: 3</p>
<p>On Firefox 3.5, split is a bit faster than the two replace methods, but indexOf is again by far the fastest.</p>
<p>1) true: 13 / false: 10<br />
2) true: 13 / false: 10<br />
3) true: 10 / false: 9<br />
4) true: 1.5 / false: 2</p>
<p>On IE 8 (run in a VM), both replace versions perform almost the same, and both are faster than split. Again, indexOf is fastest: only 2 &#8211; 3 times faster than the replace methods, but still the fastest.</p>
<p>1) true: 35 / false: 25<br />
2) true: 30 / false: 25<br />
3) true: 60 / false: 50<br />
4) true: 15 / false: 15</p>
<p>If you are interested, you can run the tests yourself; the test page is <a href="http://jensarps.de/tests/dojo_tests/test_Contains.html" target="_blank">here</a>.</p>
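<p>For readers who want to reproduce this kind of measurement without the test page, a timing loop might look roughly like the sketch below. The helper timeIt and the sample strings are assumptions for illustration and are not taken from the original test page; only the 10,000 iterations and the millisecond readout follow the description above.</p>

```javascript
// Hypothetical harness: timeIt and the sample strings are illustrative only.
function timeIt(fn, iterations) {
    var start = new Date().getTime();
    for (var i = 0; i < iterations; i++) {
        fn();
    }
    return new Date().getTime() - start; // elapsed time in milliseconds
}

var haystack = 'The quick brown fox jumps over the lazy dog';

// "true" case: the needle is present
var hitMs = timeIt(function () {
    return haystack.indexOf('lazy') !== -1;
}, 10000);

// "false" case: the needle is absent
var missMs = timeIt(function () {
    return haystack.indexOf('cat') !== -1;
}, 10000);

console.log('true: ' + hitMs + ' / false: ' + missMs);
```

<p>On current engines 10,000 iterations of such a small operation will often finish in well under a millisecond, so a higher iteration count may be needed to see any difference at all.</p>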
<h3>Summary</h3>
<p>The results are pretty obvious: indexOf() outperforms the other contestants. That is not really a surprise, considering that whatever JavaScript does during indexOf(), it also has to do before it can perform a split() or replace(). So, the proposed way for a contains() method is the following:</p>
<pre>dojo.string.contains = function(/* string */ needle, /* string */ haystack, /* bool */ caseInsensitive) {
    if(caseInsensitive) {
        needle = needle.toLowerCase();
        haystack = haystack.toLowerCase();
    }
    return haystack.indexOf(needle) !== -1;
}</pre>
<p>If you want to use contains() in your code, just copy the above lines somewhere in your code. Again, don&#8217;t forget to dojo.require("dojo.string") beforehand.</p>
<p>Or, if you want to have beginsWith(), endsWith() and contains() all-in-one, use this: <a href="http://jensarps.de/tests/dojo_tests/dojo.string.addons.js" target="_blank">dojo.string.addons.js</a></p>
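<p>A short usage sketch for contains(), including the case-insensitive switch. To keep it runnable without Dojo, the dojo object below is a bare stand-in (an assumption for illustration); in a real page the namespace comes from Dojo itself.</p>

```javascript
// Stand-in for the namespace Dojo would provide (assumption, so the
// sketch runs without Dojo loaded).
var dojo = { string: {} };

// contains() as proposed in the post
dojo.string.contains = function (needle, haystack, caseInsensitive) {
    if (caseInsensitive) {
        needle = needle.toLowerCase();
        haystack = haystack.toLowerCase();
    }
    return haystack.indexOf(needle) !== -1;
};

// Note the argument order: needle first, haystack second.
var caseSensitiveMiss = dojo.string.contains('dojo', 'I love Dojo');       // false
var caseInsensitiveHit = dojo.string.contains('dojo', 'I love Dojo', true); // true
var plainHit = dojo.string.contains('love', 'I love Dojo');                 // true
```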
]]></content:encoded>
			<wfw:commentRss>http://jensarps.de/2009/11/29/more-fun-with-strings-dojo-string-contains/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>dojo.string.beginsWith()</title>
		<link>http://jensarps.de/2009/10/27/dojo-string-beginswith/</link>
		<comments>http://jensarps.de/2009/10/27/dojo-string-beginswith/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 20:36:18 +0000</pubDate>
		<dc:creator>Jens Arps</dc:creator>
				<category><![CDATA[Dojo Love]]></category>
		<category><![CDATA[Goodies to go]]></category>
		<category><![CDATA[dojo]]></category>
		<category><![CDATA[fun]]></category>
		<category><![CDATA[js]]></category>
		<category><![CDATA[string]]></category>

		<guid isPermaLink="false">http://jensarps.de/?p=78</guid>
		<description><![CDATA[Most cases where you find String.substr() in the wild are to check if a given string begins with a certain other string. Be it checking for a prefix or sorting out zipcodes that begin with certain numbers. And because code readability is a good thing (really, it is important), it would be nice to have ]]></description>
			<content:encoded><![CDATA[<p>Most cases where you find String.substr() in the wild are to check if a given string begins with a certain other string. Be it checking for a prefix or sorting out zipcodes that begin with certain numbers. And because code readability is a good thing (really, it <em>is</em> important), it would be nice to have a String.beginsWith() method. Or, because of dojo love, a dojo.string.beginsWith() method.</p>
<p>Consider the following code:</p>
<pre>var nearbyZipcodes = dojo.filter(givenZipcodes,function(zipcode){
    return dojo.string.beginsWith('12', zipcode);
});</pre>
<p><span id="more-78"></span><br />
No need to tell you what this does, right? So, let&#8217;s do this then! The only thing left is to think about performance: Is there a difference between String.substr() and String.substring()? And can we get faster than the two?</p>
<p>If we have very short needles to look for in our haystack, we could consider the following approach:</p>
<pre>dojo.string.beginsWith = function(/* string */ needle, /* string */ haystack) {
    var i,
        len = needle.length;
    if(needle.length &gt; haystack.length) {
        return false;
    }
    for(i = 0; i &lt; len; i++) {
        if(needle.charAt(i) !== haystack.charAt(i)) {
            return false;
        }
    }
    return true;
}</pre>
<p>For very short needles, or when we expect close to no hits, this might be faster. So I set up a test page and let the different methods run against each other. The results clearly spoke against the char iteration method: On Firefox, iteration was <em>always</em> slower, even when the iteration method could return false after the first character. And when it had to iterate more often, times went up (I somehow had in mind that TraceMonkey was perfect for simple iterations, but in this case it doesn&#8217;t help). Only on Safari could the iteration method compete with native substr() / substring(), but it was never significantly faster. So, we&#8217;ll stick to the native methods (there was no real difference between the two).</p>
<p>More convenience?</p>
<p>Depending on your datasource, you might want to trim the input. No problem: as we use dojo, we can use its super fast trim and end up with the following:</p>
<pre>dojo.string.beginsWith = function(/* string */ needle, /* string */ haystack, /* bool */ trimBefore) {
    if(trimBefore) {
        needle = dojo.string.trim(needle);
    }
    if(needle.length &gt; haystack.length) {
        return false;
    }
    return haystack.substr(0,needle.length) === needle;
}</pre>
<p>So simple, so sweet.</p>
<p>You can run the tests for yourself, the page is located here: <a href="http://jensarps.de/tests/dojo_tests/test_beginsWith.html" target="_blank">test_beginsWith.html</a></p>
<p>If you want to use beginsWith in your code, just put the lines above anywhere in your code, but don&#8217;t forget to dojo.require('dojo.string') beforehand.</p>
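<p>To try beginsWith() without loading Dojo, a self-contained sketch could look like this. The dojo object and its trim are simplified stand-ins (assumptions for illustration, not Dojo&#8217;s actual implementation), the zipcodes are made-up sample data, and the native Array filter() takes the place of dojo.filter().</p>

```javascript
// Simplified stand-ins for Dojo (assumptions, so the sketch runs on its own).
var dojo = {
    string: {
        // not Dojo's real trim, just a regex-based substitute
        trim: function (str) {
            return str.replace(/^\s+|\s+$/g, '');
        }
    }
};

// beginsWith() as defined in the post: needle first, haystack second
dojo.string.beginsWith = function (needle, haystack, trimBefore) {
    if (trimBefore) {
        needle = dojo.string.trim(needle);
    }
    if (needle.length > haystack.length) {
        return false;
    }
    return haystack.substr(0, needle.length) === needle;
};

// Made-up sample data
var givenZipcodes = ['12045', '10115', '12207', '80331'];

var nearbyZipcodes = givenZipcodes.filter(function (zipcode) {
    return dojo.string.beginsWith('12', zipcode);
});
// nearbyZipcodes is ['12045', '12207']

// With trimBefore, whitespace around the needle is ignored
var trimmedHit = dojo.string.beginsWith(' 12 ', '12045', true); // true
```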
]]></content:encoded>
			<wfw:commentRss>http://jensarps.de/2009/10/27/dojo-string-beginswith/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
