<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Defvayne23.com &#187; Thursday Code Tip</title>
	<atom:link href="http://www.defvayne23.com/tag/thursday-code-tip/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.defvayne23.com</link>
	<description>A blog by John Hoover</description>
	<lastBuildDate>Sat, 12 Jun 2010 14:53:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>TCT: Crawl a Website</title>
		<link>http://www.defvayne23.com/2010/02/18/tct-crawl-a-website/</link>
		<comments>http://www.defvayne23.com/2010/02/18/tct-crawl-a-website/#comments</comments>
		<pubDate>Thu, 18 Feb 2010 16:51:32 +0000</pubDate>
		<dc:creator>John Hoover</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Thursday Code Tip]]></category>
		<category><![CDATA[Tip]]></category>

		<guid isPermaLink="false">http://www.defvayne23.com/?p=44</guid>
		<description><![CDATA[Use PHP to crawl a website and pull out content with a regular expression.]]></description>
			<content:encoded><![CDATA[<p><strong>DISCLAIMER:</strong> I do not condone doing this. There are better, more legal ways to get content from someone, but sometimes your boss asks this of you. DO NOT STEAL CONTENT.</p>
<p>For this week&#8217;s Thursday Code Tip, I will show how to use PHP to crawl a website and gather content. First, we select the URL to crawl:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sURL</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;http://www.defvayne23.com/&quot;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>Next we get the content of the page:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>2
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sContent</span> <span style="color: #339933;">=</span> <span style="color: #990000;">file_get_contents</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$sURL</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>
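<p>One thing worth guarding against: file_get_contents returns false when the request fails, and fetching a URL this way only works when allow_url_fopen is enabled. Below is a minimal sketch of a wrapper that handles the failure case. The fetchPage name and the 10 second timeout are assumptions made for this illustration, not part of the original tip.</p>

```php
<?php
// Hypothetical helper (not from the original post): fetch a page, or return null on failure.
function fetchPage($sURL)
{
    // A timeout keeps the script from hanging on a slow server (10 seconds is an arbitrary choice).
    $rContext = stream_context_create(array(
        'http' => array('timeout' => 10),
    ));

    // file_get_contents() returns false on failure; @ suppresses the warning it would emit.
    $sContent = @file_get_contents($sURL, false, $rContext);

    return ($sContent === false) ? null : $sContent;
}
```

<p>With this in place, $sContent = fetchPage($sURL); gives either the page source or null, so the rest of the script can stop early when nothing came back.</p>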

<p>Now we use <a title="REGEX" href="http://us.php.net/manual/en/book.pcre.php">REGEX</a> to get what we want. You can learn patterns <a href="http://geekswithblogs.net/brcraju/articles/235.aspx" title="REGEX">here</a>. Below we search for the text within an H1 tag and store what is found in $aMatches.</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>3
4
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sPattern</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'/&lt;h1&gt;([a-z0-9\s]+)&lt;\/h1&gt;/i'</span><span style="color: #339933;">;</span>
<span style="color: #990000;">preg_match</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$sPattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$sContent</span><span style="color: #339933;">,</span> <span style="color: #000088;">$aMatches</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>The above won&#8217;t find anything because I link all my H1s. So let&#8217;s modify the pattern so it accounts for the links but does not capture them.</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>3
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sPattern</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'/&lt;h1&gt;&lt;a [^&gt;]+&gt;([a-z0-9\s]+)&lt;\/a&gt;&lt;\/h1&gt;/i'</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>Now that the pattern accounts for the anchor, $aMatches should contain:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #990000;">Array</span>
<span style="color: #009900;">&#40;</span>
    <span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=&gt;</span> Defvayne23
    <span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=&gt;</span> Defvayne23
<span style="color: #009900;">&#41;</span></pre></td></tr></table></div>

<p>The first element of the array is the HTML it found, including the h1 and anchor (the tags are not visible in the output because a browser renders them). The second is just the text that we were looking for.</p>
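<p>To see the difference between the two patterns without touching a live site, here is a small sketch that runs both against a sample string. The markup in $sContent is invented for this illustration and is not the real page source.</p>

```php
<?php
// Sample markup invented for this illustration; it is not the real page source.
$sContent = '<h1><a href="/">Defvayne23</a></h1>';

$sPlain  = '/<h1>([a-z0-9\s]+)<\/h1>/i';
$sLinked = '/<h1><a [^>]+>([a-z0-9\s]+)<\/a><\/h1>/i';

// The plain pattern finds nothing because the anchor tag sits between <h1> and the text.
$iPlain = preg_match($sPlain, $sContent, $aPlain);

// The linked pattern skips over the anchor's attributes and captures only the text.
$iLinked = preg_match($sLinked, $sContent, $aLinked);

echo $iPlain, "\n";      // number of matches for the plain pattern
echo $aLinked[0], "\n";  // full match, tags included
echo $aLinked[1], "\n";  // captured group only
```

<p>Run from the command line, the tags in the full match are printed as text, which makes it easier to tell the two array elements apart.</p>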
<p>Here it is all together:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$sURL</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;http://www.defvayne23.com/&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$sContent</span> <span style="color: #339933;">=</span> <span style="color: #990000;">file_get_contents</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$sURL</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$sPattern</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">'/&lt;h1&gt;&lt;a [^&gt;]+&gt;([a-z0-9\s]+)&lt;\/a&gt;&lt;\/h1&gt;/i'</span><span style="color: #339933;">;</span>
<span style="color: #990000;">preg_match</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$sPattern</span><span style="color: #339933;">,</span> <span style="color: #000088;">$sContent</span><span style="color: #339933;">,</span> <span style="color: #000088;">$aMatches</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$sHeader</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$aMatches</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>
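<p>The script above reads $aMatches[1] without checking whether anything matched, and it only looks at the first heading. A minimal sketch that covers both cases follows, using preg_match_all in place of preg_match. The extractHeaders name is an assumption made for this example.</p>

```php
<?php
// Hypothetical helper: return every linked H1 text found in $sContent, or an empty array.
function extractHeaders($sContent)
{
    $sPattern = '/<h1><a [^>]+>([a-z0-9\s]+)<\/a><\/h1>/i';

    // preg_match_all() returns the number of full matches, or false on error.
    if (!preg_match_all($sPattern, $sContent, $aMatches)) {
        return array();
    }

    // Index 1 holds the text captured by the first parenthesised group of each match.
    return array_map('trim', $aMatches[1]);
}
```

<p>Keep in mind the character class only allows letters, digits and whitespace, so a heading containing punctuation, such as a colon, will not match until the pattern is widened.</p>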

]]></content:encoded>
			<wfw:commentRss>http://www.defvayne23.com/2010/02/18/tct-crawl-a-website/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thursday Code Tip: Understand Classes</title>
		<link>http://www.defvayne23.com/2010/02/11/thursday-code-tip-understand-classes/</link>
		<comments>http://www.defvayne23.com/2010/02/11/thursday-code-tip-understand-classes/#comments</comments>
		<pubDate>Thu, 11 Feb 2010 17:48:56 +0000</pubDate>
		<dc:creator>John Hoover</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Kaizen CMS]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Thursday Code Tip]]></category>
		<category><![CDATA[Tip]]></category>

		<guid isPermaLink="false">http://www.defvayne23.com/?p=40</guid>
		<description><![CDATA[How a PHP class keeps track of an item for you, using the Kaizen CMS makeImage helper as an example.]]></description>
			<content:encoded><![CDATA[<p>Working with classes in PHP is great not only for keeping your framework clean, but also for handling an item with ease. The <a href="http://github.com/defvayne23/Kaizen-CMS/blob/master/helpers/makeImage.helper.php">makeImage</a> helper for <a title="GitHub" href="http://github.com/defvayne23/Kaizen-CMS/">Kaizen CMS</a> makes use of this. It&#8217;s easy to make changes to the image after you create the class object, as shown below. The code below resizes an image, then crops it to fit within a size constraint.</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
</pre></td><td class="code"><pre class="php" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// Pass the file path and option to cache the image to the class</span>
<span style="color: #000088;">$oImage</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> makeImage<span style="color: #009900;">&#40;</span><span style="color: #000088;">$sFile</span><span style="color: #339933;">,</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Tell the object to resize the image, no need to pass the image</span>
<span style="color: #000088;">$oImage</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">resize</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">200</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">100</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Now crop the image after it was resized</span>
<span style="color: #000088;">$oImage</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">crop</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">100</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Now send the image data to the user</span>
<span style="color: #000088;">$oImage</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">draw</span><span style="color: #009900;">&#40;</span><span style="color: #009900; font-weight: bold;">null</span><span style="color: #339933;">,</span> <span style="color: #cc66cc;">100</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>The class handles all the dirty work of keeping track of the image and knowing all the info about it. The code above is easier to maintain than doing all the work inline in the file. The class used above is available <a title="GitHub" href="http://github.com/defvayne23/Kaizen-CMS/blob/master/helpers/makeImage.helper.php">here</a>.</p>
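<p>To show the idea in isolation, here is a minimal sketch of a class that remembers its own state between calls. ImageBox is a hypothetical class written for this illustration. It only tracks dimensions and is not the makeImage helper or its API.</p>

```php
<?php
// Hypothetical class for illustration only; it is not the Kaizen CMS makeImage helper.
class ImageBox
{
    private $iWidth;
    private $iHeight;

    public function __construct($iWidth, $iHeight)
    {
        $this->iWidth  = $iWidth;
        $this->iHeight = $iHeight;
    }

    // Scale to fit inside the given box while keeping the aspect ratio.
    public function resize($iMaxWidth, $iMaxHeight)
    {
        $fScale = min($iMaxWidth / $this->iWidth, $iMaxHeight / $this->iHeight);
        $this->iWidth  = (int) round($this->iWidth * $fScale);
        $this->iHeight = (int) round($this->iHeight * $fScale);
        return $this; // returning $this allows chained calls
    }

    // Trim the tracked size down to at most the given box.
    public function crop($iWidth, $iHeight)
    {
        $this->iWidth  = min($this->iWidth, $iWidth);
        $this->iHeight = min($this->iHeight, $iHeight);
        return $this;
    }

    public function getSize()
    {
        return array($this->iWidth, $this->iHeight);
    }
}

// The object carries the current size from one call to the next.
$oBox  = new ImageBox(400, 200);
$aSize = $oBox->resize(200, 100)->crop(100, 100)->getSize();
```

<p>The calling code never passes the width and height around. The object holds them, which is the same convenience the real helper gives you for image data.</p>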
]]></content:encoded>
			<wfw:commentRss>http://www.defvayne23.com/2010/02/11/thursday-code-tip-understand-classes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
