post header image for URI Parser For HaXe

URI Parser For HaXe

11th April 2011

Continuing on my theme of the moment haXe, I have another post  regarding the development of my haXe rewrite of  ChromeCrawler.

I was in need of a way to split a URL into its various parts. To do this in previous versions of ChromeCrawler I used a ready built one I found on the web.

I thought it should be a fairly simple matter to port this to haXe, unfortunately however this wasn't the case. The problem was that haXe, unlike JS, doesnt have the exec() method on its regular expression function. What this meant is that the URL couldnt be split in the same way.

Confused I jumped on the haXe IRC, unfortunately the solutions the kind people there provided didnt work. Instead I posted a message on the mailing list and within a few hours I had my answer. The solution was to use EReg.match() then EReg.matched() to get each part.

Anyways, I promised to share the code when I was done so here it is:

package utils; import haxe.Http; /\*\* - ... - \*/ class URLParser { // Publics public var url : String; public var source : String; public var protocol : String; public var authority : String; public var userInfo : String; public var user : String; public var password : String; public var host : String; public var port : String; public var relative : String; public var path : String; public var directory : String; public var file : String; public var query : String; public var anchor : String; // Privates inline static private var _parts : Array<String> = ["source","protocol","authority","userInfo","user","password","host","port","relative","path","directory","file","query","anchor"]; public function new(url:String) { // Save for 'ron this.url = url; // The almighty regexp (courtesy of var r : EReg = ~/^(?:(?![^:@]+:[^:@/]*@)([^:/?#.]+):)?(?://)?((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:/?#]*)(?::(d*))?)(((/(?:[^?#](?![^?#/]*.[^?#/.]+(?:[?#]|$)))*/?)?([^?#/]*))(?:?([^#]*))?(?:#(.*))?)/; // Match the regexp to the url r.match(url); // Use reflection to set each part for (i in 0..._parts.length) { Reflect.setField(this, _parts[i], r.matched(i)); } } public function toString() : String { var s : String = "For Url -> " + url + "n"; for (i in 0..._parts.length) { s += _parts[i] + ": " + Reflect.field(this, _parts[i]) + (i==_parts.length-1?"":"n"); } return s; } public static function parse(url:String) : URLParser { return new URLParser(url); } }

So for example the following use:

trace(new URLParser(""));

Will print the following:

For Url ->
protocol: http
userInfo: undefined
user: undefined
password: undefined
port: undefined
relative: /programming/haxe/haxe-jqueryextern-gotcha?somevar=1242#home
path: /programming/haxe/haxe-jqueryextern-gotcha
directory: /programming/haxe/haxe-jqueryextern-gotcha
query: somevar=1242
anchor: home


Im not sure how performant the reflection usage would be on the various platforms haXe targets but atleast it would work and its fairly elegant to boot ;)

Edit: Thank you Adrian Cowen for posting this as a haXe snippet: