Getting Main Domain

Do you want to extract the main domain of a URL? Here is a function that works well and can be modified easily. I found it while searching for a good regular expression to do the same job.
<?php

function get_base_domain($url)
{
    $debug = 0;
    $base_domain = '';

    $G_TLD = array(
        'biz','com','edu','gov','info','int','mil','name','net','org','aero','asia','cat','coop','jobs','mobi','museum',
        'pro','tel','travel','arpa','root','berlin','bzh','cym','gal','geo','kid','kids','lat','mail','nyc','post','sco','web','xxx',
        'nato','example','invalid','localhost','test','bitnet','csnet','ip','local','onion','uucp',
        'co'
    );

    // country tlds (source: http://en.wikipedia.org/wiki/Country_code_top-level_domain)
    $C_TLD = array(
        // active
        'ac','ad','ae','af','ag','ai','al','am','an','ao','aq','ar','as','at','au','aw','ax','az',
        'ba','bb','bd','be','bf','bg','bh','bi','bj','bm','bn','bo','br','bs','bt','bw','by','bz',
        'ca','cc','cd','cf','cg','ch','ci','ck','cl','cm','cn','co','cr','cu','cv','cx','cy','cz',
        'de','dj','dk','dm','do','dz','ec','ee','eg','er','es','et','eu','fi','fj','fk','fm','fo',
        'fr','ga','gd','ge','gf','gg','gh','gi','gl','gm','gn','gp','gq','gr','gs','gt','gu','gw',
        'gy','hk','hm','hn','hr','ht','hu','id','ie','il','im','in','io','iq','ir','is','it','je',
        'jm','jo','jp','ke','kg','kh','ki','km','kn','kr','kw','ky','kz','la','lb','lc','li','lk',
        'lr','ls','lt','lu','lv','ly','ma','mc','md','mg','mh','mk','ml','mm','mn','mo','mp','mq',
        'mr','ms','mt','mu','mv','mw','mx','my','mz','na','nc','ne','nf','ng','ni','nl','no','np',
        'nr','nu','nz','om','pa','pe','pf','pg','ph','pk','pl','pn','pr','ps','pt','pw','py','qa',
        're','ro','ru','rw','sa','sb','sc','sd','se','sg','sh','si','sk','sl','sm','sn','sr','st',
        'sv','sy','sz','tc','td','tf','tg','th','tj','tk','tl','tm','tn','to','tr','tt','tv','tw',
        'tz','ua','ug','uk','us','uy','uz','va','vc','ve','vg','vi','vn','vu','wf','ws','ye','yu',
        'za','zm','zw',
        // inactive
        'eh','kp','me','rs','um','bv','gb','pm','sj','so','yt','su','tp','bu','cs','dd','zr'
    );

    // get domain
    if ( !$full_domain = get_url_domain($url) ) {
        return $base_domain;
    }

    // now the fun

    // break up domain, reverse
    $DOMAIN = explode('.', $full_domain);
    if ( $debug ) print_r($DOMAIN);
    $DOMAIN = array_reverse($DOMAIN);
    if ( $debug ) print_r($DOMAIN);

    // first check for ip address
    if ( count($DOMAIN) == 4 && is_numeric($DOMAIN[0]) && is_numeric($DOMAIN[3]) ) {
        return $full_domain;
    }

    // if only 2 domain parts, that must be our domain
    if ( count($DOMAIN) <= 2 ) return $full_domain;

    if ( in_array($DOMAIN[0], $C_TLD) && in_array($DOMAIN[1], $G_TLD) && $DOMAIN[2] != 'www' ) {
        $full_domain = $DOMAIN[2] . '.' . $DOMAIN[1] . '.' . $DOMAIN[0];
    } else {
        $full_domain = $DOMAIN[1] . '.' . $DOMAIN[0];
    }

    // did we succeed?
    return $full_domain;
}

function get_url_domain($url)
{
    $domain = '';

    $_URL = parse_url($url);

    // sanity check
    if ( empty($_URL) || empty($_URL['host']) ) {
        $domain = '';
    } else {
        $domain = $_URL['host'];
    }

    return $domain;
}

?>
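The heart of get_base_domain() is the check on the reversed host labels: when the last label is a country code and the one before it is a generic TLD, three labels are kept instead of two. Here is a trimmed-down sketch of just that step so it is easier to follow. The name sketch_base_domain() and the tiny TLD lists are my own assumptions for illustration, not part of the original function.

```php
<?php
// Hypothetical, trimmed-down variant for illustration only.
// The TLD lists are deliberately tiny; the full function uses much longer ones.
function sketch_base_domain(string $host): string
{
    $generic = ['com', 'org', 'net', 'co'];
    $country = ['uk', 'ph', 'au'];

    // 'news.bbc.co.uk' becomes ['uk', 'co', 'bbc', 'news']
    $labels = array_reverse(explode('.', $host));

    // one or two labels: nothing to strip
    if (count($labels) <= 2) {
        return $host;
    }

    // country code preceded by a generic TLD: keep three labels
    if (in_array($labels[0], $country, true)
        && in_array($labels[1], $generic, true)
        && $labels[2] !== 'www') {
        return $labels[2] . '.' . $labels[1] . '.' . $labels[0];
    }

    // otherwise keep the last two labels
    return $labels[1] . '.' . $labels[0];
}

echo sketch_base_domain('news.bbc.co.uk'), "\n"; // bbc.co.uk
echo sketch_base_domain('www.icpep.org'), "\n";  // icpep.org
```

Note the 'www' guard: a host such as www.com.ph comes back as com.ph rather than www.com.ph.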
To test get_base_domain(), pass it a URL:
$url = 'http://www.icpep.org';
echo get_base_domain($url); // icpep.org
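One thing to watch for: parse_url() only fills in the host when the URL carries a scheme, so a bare 'www.icpep.org' is treated as a path and get_base_domain() would return an empty string for it. A minimal sketch of one way to handle that, assuming you are happy to default to http://. The helper name host_with_default_scheme() is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper, not part of the original post: prepend a scheme when
// none is present so that parse_url() can identify the host.
function host_with_default_scheme(string $url): string
{
    // no "scheme://" prefix? assume http://
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);

    // parse_url() returns null (or false on badly malformed input) when no host is found
    return is_string($host) ? $host : '';
}

var_dump(parse_url('www.icpep.org', PHP_URL_HOST));                 // NULL: treated as a path
echo host_with_default_scheme('www.icpep.org'), "\n";               // www.icpep.org
echo host_with_default_scheme('https://www.icpep.org/about'), "\n"; // www.icpep.org
```

The same normalisation could be applied to $url before calling get_base_domain().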
This code really helped me a lot. There are a lot of regular expressions out there, but this simple approach can replace all those head-breaking expressions. By the way, this function is free to use and is under GNU licensing. I hope it is a big help to you too. Happy coding!
Source: paparts