XML Parsing with PHP & Python

A while ago I attended an interview with Kaweb (I didn’t get the role, by the way). They asked me whether I had done XML processing before, and I said I had done XML and HTML processing with DOMDocument. They also asked whether I had used XPath. I said no, but that I had heard of it; I remember saying it’s like Unix directory structures.

Anyhow, I’ll just go ahead and show the script in PHP and Python. I only used XPath with PHP, not with Python.

PHP (with XPath)

<?php

$dom = new DOMDocument();
$dom->load('http://cj-jackson.com/feed/');

$xpath = new DOMXPath($dom);
$nodes = $xpath->query("//channel/item[position() <=5]");

foreach($nodes as $node) {
	echo $node->getElementsByTagName('title')->item(0)->nodeValue . '<br />';
}

Python (no XPath)

#!/usr/bin/python2.7
from urllib2 import build_opener
from xml.etree.cElementTree import parse as xmlparse

opener = build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0'), ('Accept', '*/*')] # To get round anti-spam system.
source = opener.open('http://cj-jackson.com/feed/')

feed = xmlparse(source).getroot()

for element in feed.findall('channel/item')[:5]:
	print element.findtext('title')

Output

Happy New Year!
RockForums.Co Revisited
Screw dual-boot, Synergy is Awesome!
My Motorbike Got Stolen
No more post for awhile
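
For anyone on Python 3: the Python script above targets 2.7, and both urllib2 and cElementTree are gone in current Python 3 releases. Below is a rough Python 3 equivalent, as a sketch only. The function names first_titles and fetch_titles and the SAMPLE feed are made up for illustration; the sample stands in for the real feed so the parsing can be tried offline.

```python
import io
from urllib.request import build_opener
from xml.etree.ElementTree import parse as xmlparse


def first_titles(source, limit=5):
    """Return the titles of the first `limit` <item> elements of an RSS feed.

    `source` is a filename or a file-like object.
    """
    feed = xmlparse(source).getroot()
    items = feed.findall('channel/item')[:limit]
    return [item.findtext('title') for item in items]


def fetch_titles(url, limit=5):
    """Download the feed with a browser-like User-agent, then parse it."""
    opener = build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0'), ('Accept', '*/*')]
    with opener.open(url) as source:
        return first_titles(source, limit)


# Stand-in feed (hypothetical content) so no network connection is needed.
SAMPLE = b"""<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example</title>
<item><title>First</title></item>
<item><title>Second</title></item>
<item><title>Third</title></item>
</channel></rss>"""

print(first_titles(io.BytesIO(SAMPLE), limit=2))
```

Against the live feed the call would be fetch_titles('http://cj-jackson.com/feed/'), which is not run here.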

Conclusion

Python and ElementTree are so elegant; the API is pretty much written in a way that means I don’t need to use XPath. As for PHP’s DOMDocument, at least it supports HTML processing as well. With Python I had to use html5lib for HTML processing, and the only problem I have with html5lib is that it doesn’t come with Python by default, unlike ElementTree and cElementTree.

The difference between ElementTree and cElementTree is that the former is written in Python and the latter, as the name implies, in C, which is why it’s also the faster of the two. Nothing is faster than C except the speed of light.

Update:

ElementTree only supports a limited subset of XPath. If you want full XPath in Python, use lxml instead; it does not come with Python by default.
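
To be more precise about the XPath support: the ElementTree that ships with Python 2.7 and later understands a small subset of XPath in find() and findall(), including simple predicates such as [tag], [1] and [last()]. The documented predicate forms do not include position() comparisons like the one in the PHP query, so "first N" is still a slice. A small sketch, using a made-up SAMPLE feed rather than the real one:

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, only for illustration.
SAMPLE = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>First</title><category>news</category></item>
<item><title>Second</title></item>
<item><title>Third</title><category>news</category></item>
</channel></rss>"""

root = ET.fromstring(SAMPLE)

# Child-element predicate: only items that have a <category> child.
tagged = [i.findtext('title') for i in root.findall('./channel/item[category]')]

# Positional predicates: [1] is the first item, [last()] the last one.
first = root.find('./channel/item[1]').findtext('title')
last = root.find('./channel/item[last()]').findtext('title')

# No position() function in this subset, so take the first two with a slice.
first_two = [i.findtext('title') for i in root.findall('.//item')[:2]]

print(tagged, first, last, first_two)
```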

RockForums.Co 0.95 – Bye Bye Symfony

In the last post I said I had the confidence to rewrite RockForums.Co from scratch, without Symfony or any other PHP framework. In that post I also showed you the source code of the URL Routing System. I used that code in RockForums.Co as well and it works like a charm, although I did make a few modifications to get it to work on the production server.

My development server, which happens to run Windows 7 and IIS, let me get away with something like “include '/../view/example.php';”, but that didn’t work on the production server, which runs Linux and Apache. I had to add “dirname(__FILE__)” to get it to work on production while still working on the development server.
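
The same idea, building the path from the directory of the current file instead of trusting how the server resolves it, looks like this in Python. It is only a sketch: path_beside is a name invented here, and the /srv/app paths are placeholders.

```python
import os


def path_beside(script_file, relative):
    """Resolve `relative` against the directory containing `script_file`.

    `relative` must not start with a slash: os.path.join() discards the
    earlier components when a later one is absolute.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.normpath(os.path.join(script_dir, relative))


# Inside a real script this would be: path_beside(__file__, '../view/example.php')
example = path_beside(
    os.path.join(os.sep, 'srv', 'app', 'controller', 'page.py'),
    os.path.join('..', 'view', 'example.php'),
)
print(example)
```

Because the result is anchored to the file itself, it does not depend on the working directory or on which web server is running the script.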

Nevertheless, the code on the development server is exactly the same as on the production server, because I have written a simple if statement which determines the difference between development and production and therefore loads the correct database configuration with no problem. I couldn’t do that with the Symfony framework; I had to correct the configuration manually for production and, like any other human, I am prone to making mistakes: I could accidentally overwrite the configuration with the wrong one. Because I have rewritten the script, that won’t happen.
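
A minimal sketch of that kind of if statement, written in Python. Everything in it is assumed for illustration: the hostname check, the 'dev-box' name and the database settings are not the ones RockForums.Co uses.

```python
import socket

# Hypothetical values; none of these come from the real site.
DEVELOPMENT_HOSTS = {'dev-box'}

CONFIGS = {
    'development': {'host': 'localhost', 'name': 'forum_dev', 'user': 'dev'},
    'production': {'host': 'localhost', 'name': 'forum', 'user': 'forum'},
}


def detect_environment(hostname=None):
    """Return 'development' on a known dev machine, otherwise 'production'."""
    if hostname is None:
        hostname = socket.gethostname()
    if hostname in DEVELOPMENT_HOSTS:
        return 'development'
    return 'production'


def database_config(hostname=None):
    """Pick the database settings that match the detected environment."""
    return CONFIGS[detect_environment(hostname)]


print(database_config('dev-box')['name'])
print(database_config('web01')['name'])
```

Defaulting to production when the machine is unknown means a forgotten entry cannot point the live site at the development database.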

Speaking of frameworks, the best PHP framework there is, is PHP itself: not Symfony, not CakePHP and certainly not Zend Framework, but PHP itself from PHP.net, because it is simple and does not add too much complexity. Plus, PHP follows the reflection pattern quite well; the URL Routing System is an example of the reflection pattern. I find the Symfony function link_to() ridiculous because all it does is generate a hyperlink, which is very easy to write in HTML. (<a class="link" href="http://example.com">Example</a>)
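
For readers who have not seen reflection-based routing, here is the general shape of it in Python. This is not the URL Routing System from the earlier post, just a sketch of the pattern; ForumController and dispatch are invented names.

```python
import inspect


class ForumController:
    """Hypothetical controller; each public method is a routable action."""

    def index(self):
        return 'topic list'

    def topic(self, topic_id):
        return 'topic %s' % topic_id


def dispatch(controller, path):
    """Route '/action/arg1/arg2' to controller.action(arg1, arg2) by name."""
    parts = [p for p in path.split('/') if p]
    action = parts[0] if parts else 'index'
    args = parts[1:]
    # Refuse private names so a URL cannot reach internals like __init__.
    if action.startswith('_'):
        return '404'
    handler = getattr(controller, action, None)
    if not callable(handler):
        return '404'
    try:
        # Check that the URL supplies the right number of arguments.
        inspect.signature(handler).bind(*args)
    except TypeError:
        return '404'
    return handler(*args)


controller = ForumController()
print(dispatch(controller, '/'))
print(dispatch(controller, '/topic/42'))
print(dispatch(controller, '/missing'))
```

The method is looked up by name at run time, so adding a page means adding a method, with no separate routing table to keep in sync.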

I have written an auto-upgrade script, which updates the tables automatically so I don’t need to modify them manually while deploying to production. HTML5 AV Manager for WordPress also has that script.
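
One common way to build such an upgrade script is to store a schema version in the database and apply only the steps that are newer. The sketch below does that with Python and SQLite. It is an assumption about the general technique, not the actual RockForums.Co script, and the table and column names are made up.

```python
import sqlite3

# Hypothetical upgrade steps; step N takes the schema from version N to N + 1.
UPGRADES = [
    "CREATE TABLE topic (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE topic ADD COLUMN slug TEXT",
    "CREATE INDEX topic_slug ON topic (slug)",
]


def upgrade(conn):
    """Apply any upgrade steps the database has not seen yet.

    Returns the number of steps applied on this run.
    """
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    applied = 0
    for version in range(current, len(UPGRADES)):
        conn.execute(UPGRADES[version])
        # PRAGMA does not accept bound parameters; version is a trusted int.
        conn.execute("PRAGMA user_version = %d" % (version + 1))
        applied += 1
    conn.commit()
    return applied


conn = sqlite3.connect(':memory:')
print(upgrade(conn))  # first run applies every step
print(upgrade(conn))  # second run finds nothing left to do
```

Running it on every deploy is safe because a database that is already up to date is left alone.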

I have also written an auto-embed library called oEmbedder. Yes, as the name implies, it uses oEmbed, and it includes support for Embedly. I open-sourced it and released it on Google Code under the MIT License, available from http://code.google.com/p/oembedder/ . I am aware of PHP-oEmbed and oEmbed-PHP on Google Code; one was a bit bloated, and the other was simpler but not flexible enough for my taste. Both of them had error checking, which I find kind of pointless, because on failure json_decode returns null and SimpleXML returns false, and that’s all the information I need to know. Basically it either works or it doesn’t, just like HDMI cables, so please don’t buy the expensive ones; it’s just a waste of money.
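
The "it either works or it doesn’t" approach can be sketched in Python like this. It is not oEmbedder’s code; embed_html is an invented name, and the sketch only assumes the type, html and url fields that oEmbed responses carry.

```python
import html
import json


def embed_html(response_body):
    """Turn an oEmbed JSON response into embeddable HTML, or None on failure."""
    try:
        data = json.loads(response_body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(data, dict):
        return None
    # 'video' and 'rich' responses carry ready-made markup in 'html'.
    if data.get('type') in ('video', 'rich') and 'html' in data:
        return data['html']
    # 'photo' responses carry an image URL instead.
    if data.get('type') == 'photo' and 'url' in data:
        return '<img src="%s" alt="">' % html.escape(data['url'], quote=True)
    return None


good = '{"type": "video", "version": "1.0", "html": "<iframe></iframe>"}'
print(embed_html(good))
print(embed_html('not json at all'))
```

The caller only has to check for None and fall back to a plain link; no error codes or messages are involved.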

The forum is at http://rockforums.co , enjoy.

Getting the hang of symfony!

I’m currently working on a new forum script for rockforums.tk using the symfony framework (rolls off my tongue nicely). The reason I’m doing this is so that I can improve my prospects and have something to show off in my portfolio for potential employers to see. Anyway, I’ll just show off two pieces of my work, which are the controller and the view of a page, as well as the output. As for the model and other parts of the code, I would rather keep those a closely guarded secret.

Controller

<?php
class contentActions extends sfActions {

    public function executeIndex(sfWebRequest $request) {
        $pager = new sfDoctrinePager('Topic', 10);
        $pager->getQuery();
        $pager->setPage($this->getRequestParameter('page',1));
        $pager->init();

        $this->topics = $pager->getResults();
    }
}
?>

View

Topics!

<?php foreach($topics as $topic) : ?>
<?php echo $topic->getID(); ?>
<?php echo $topic->getTitle(); ?>
<?php echo $topic->getSlug(); ?>
<?php echo $topic->getFormattedStartDate(); ?>
<?php echo $topic->getFormattedLastPostDate(); ?>
<?php echo $topic->getIpAddress(); ?>
<?php endforeach; ?>

Output

As you can see, I’ve got the design out of the way and I still have a lot of work to do. Please do not ask questions; I simply don’t have the bloody time to answer, thank you. But you can give me tips if you want to. ;-)