Wondering how to turn your Firefox bookmarks into a WordPress blog?

To stay on topic for the question (Firefox): I don't think using Delicious as an intermediate step is the preferred approach, because:

  1. you lose the hierarchical taxonomy applied in Firefox (the way you structured things)
  2. you lose the favicons gathered in Firefox
  3. you lose the information added by dividers between links
  4. you lose the order of directories and URLs as applied in Firefox
  5. it is not compatible with other sources of bookmarks, e.g. other browsers or a directory of URLs

Therefore my approach is to (a) export Firefox to a “BOOKMARK” directory structure with each bookmark saved as a .url file. (b) This bookmark directory is the actual heart: it can be filled from other browsers, it holds the hierarchical information, and additional meta information can be placed inside the .url files.

(At this stage I have dropped the divider export from my current code.)

From WordPress you can traverse the directory structure and place it in WP.
What you will notice is that links as currently stored in WordPress also lose the applied directory taxonomy (e.g. no hierarchical categories) and have no good meta table. Therefore I have chosen to make a side table for my link storage to retain this information (see the other answer for the steps after that).
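To give an idea of what that traversal could look like, here is a minimal sketch, not my production code. The function names collectUrlFiles and parseUrlFile are made up for this illustration, and it assumes the .url files contain the URL= and description= lines written by the exporter further down.

```php
<?php
// Hypothetical helper: parse the KEY=VALUE lines of a .url file.
function parseUrlFile(string $contents): array
{
    $fields = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        // Split on the first "=" only, since URLs may contain "=" themselves.
        $parts = explode('=', $line, 2);
        if (count($parts) === 2) {
            $fields[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $fields;
}

// Hypothetical helper: walk the bookmark directory and return one record
// per .url file, keeping the folder path as the hierarchical category.
function collectUrlFiles(string $rootDir): array
{
    $rootDir = rtrim($rootDir, '/\\');
    $links = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($rootDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (strtolower($file->getExtension()) !== 'url') {
            continue;
        }
        $fields = parseUrlFile(file_get_contents($file->getPathname()));
        $relative = (string) substr($file->getPath(), strlen($rootDir));
        $links[] = [
            'title'       => pathinfo($file->getFilename(), PATHINFO_FILENAME),
            'folder'      => ltrim(str_replace('\\', '/', $relative), '/'),
            'url'         => $fields['URL'] ?? '',
            'description' => $fields['description'] ?? '',
        ];
    }
    return $links;
}
```

Each returned record could then be written to the side table, with the folder field carrying the hierarchy that the built-in links lose.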

The following might be what you need if you want to focus on exporting Firefox first. Once again: traversing a physical directory and then reading it into e.g. a database table (wplinks) is out-of-the-box:

require_once("class-EdlSqliteDb.php");
require_once("class-EdlUtil.php");

class EdlFirefox {

const BOOKMARKTYPE_URL = 1;
const BOOKMARKTYPE_DIRECTORY = 2;
const BOOKMARKTYPE_DIVIDER = 3;
const FMODE = 0777;
const DIVIDER = '--------------';

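public $use_cache = true;
public $dbh;

// set in the constructor
private $mDbLocation;
private $mRootTitle;
private $mExportLocation;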

public function __construct($DbLocation, $ffRoot, $exportLocationBookmarks) 
{
    $this->mDbLocation = $DbLocation;
    $this->mRootTitle = $ffRoot;
    $this->mExportLocation = $exportLocationBookmarks;

    // database settings
    $this->dbh = new EdlSqliteDb($DbLocation);
    $this->dbh->addQ(1,"SELECT id FROM moz_bookmarks WHERE title=?");
    $this->dbh->addQ(2,"SELECT id, title, type, fk FROM moz_bookmarks WHERE parent=? ORDER BY position");
    $this->dbh->addQ(3,"SELECT content FROM moz_items_annos WHERE item_id=?");
    $this->dbh->addQ(4,"SELECT url,favicon_id FROM moz_places WHERE id=?");
    $this->dbh->addQ(5,"SELECT data, mime_type FROM moz_favicons WHERE id=?");

    // parse the content
    $this->ParseTree();     
}

/*
 * look up the root folder in the Firefox database and export everything below it
 *
 */
function ParseTree()
{           
    $row = $this->dbh->DbExecutePrepared(1, Array($this->mRootTitle), 'row');
    if (USE_FIREFOX_FOLDER)
    {
        $this->parsePagesPerTree($row[0], $this->mExportLocation . "/" . FIREFOX_FOLDER . "/");
    }
    else 
    {
        $this->parsePagesPerTree($row[0], $this->mExportLocation . "/");
    }
    return;
}

/*
 * if a bookmark is a url then write it as a file
 *
 */
function processFFUrl($moz_bookmarks_id, $moz_bookmarks_fk, $moz_bookmarks_title, $strRootFolder)
{
    // (1.1) Get from the annotations the description of the url (may be missing)
    $moz_items_annos_row = $this->dbh->DbExecutePrepared(3, Array($moz_bookmarks_id), 'row');
    $moz_items_annos_description = $moz_items_annos_row ? $moz_items_annos_row[0] : '';

    // (1.2) get the url and favicon_id from moz_places
    $moz_places_url        = '';
    $moz_places_favicon_id = null;
    if ($moz_places_recordset = $this->dbh->DbExecutePrepared(4, Array($moz_bookmarks_fk), 'recordset'))
    {
        foreach ($moz_places_recordset as $moz_places_row) 
        {
            $moz_places_url        = $moz_places_row[0];
            $moz_places_favicon_id = $moz_places_row[1];
        }
    }   

    $this->getFaviconIcon($moz_places_favicon_id, $moz_places_url);

    // (1.3) create the file
    $link_url_string = "[InternetShortcut]\n";
    $link_url_string .= 'URL=' . $moz_places_url . "\n";
    $link_url_string .= 'description=' . $moz_items_annos_description . "\n";   

    // only write the file if it does not exist yet
    $filename = $strRootFolder . "/" . $moz_bookmarks_title . '.url';
    if (!is_file($filename)) 
    {
        $fp = fopen($filename, 'w');
        fwrite($fp, $link_url_string);
        fclose($fp);
    }
}

/*
 * for each logical folder create a physical folder
 *
 */
function parsePagesPerTree($intRootId, $strRootFolder)
{
    if ($moz_bookmarks_recordset = $this->dbh->DbExecutePrepared(2, Array($intRootId), 'recordset'))
    {
        foreach ($moz_bookmarks_recordset as $moz_bookmarks_row)
        {
            $moz_bookmarks_id       = $moz_bookmarks_row[0];
            $moz_bookmarks_title    = EdlUtil::filename_safe($moz_bookmarks_row[1]);
            $moz_bookmarks_type     = $moz_bookmarks_row[2];
            $moz_bookmarks_fk       = $moz_bookmarks_row[3];
            $moz_bookmarks_url="";
            $moz_bookmarks_favicon_id = '';

            // A bookmark can be one of three things: process (1) urls, (2) directories and (3) dividers
            if ($moz_bookmarks_type==self::BOOKMARKTYPE_URL)
            {
                $this->processFFUrl($moz_bookmarks_id, $moz_bookmarks_fk, $moz_bookmarks_title, $strRootFolder);
            } 
            elseif ($moz_bookmarks_type==self::BOOKMARKTYPE_DIRECTORY)
            {       
                $dir = $strRootFolder . "/" . $moz_bookmarks_title . "/";
                if (!file_exists($dir)) 
                {
                    if (!mkdir($dir, self::FMODE, true)) 
                    {
                        die('Failed to create folders...');
                    }                   
                }
                $this->parsePagesPerTree($moz_bookmarks_id, $strRootFolder . "/" . $moz_bookmarks_title);
            }
            elseif ($moz_bookmarks_type==self::BOOKMARKTYPE_DIVIDER)
            {
                // todo         
            }       
        }
    }   
    return;
}   

/*
 * look up the favicon of a bookmark and store it in the file cache
 *
 */
function getFaviconIcon($moz_bookmarks_favicon_id, $moz_bookmarks_url)
{
    $icon_data="";
    $moz_bookmarks_favicon = '';
    if ($moz_bookmarks_favicon_id)
    {
        if ($moz_favicons_recordset = $this->dbh->DbExecutePrepared(5, Array($moz_bookmarks_favicon_id), 'recordset'))
        {
            foreach ($moz_favicons_recordset as $moz_favicons_row)
            {
                $icon_data      = $moz_favicons_row[0];
                $icon_mime_type = $moz_favicons_row[1];
                // the following array is also defined in the google icon checker!
                $icon_type = array(  'image/png'    => 'a.png',
                                 'image/gif'    => 'a.gif',
                                 'image/x-icon' => 'a.ico',
                                 'image/jpeg'   => 'a.jpg',
                                 'image/bmp'    => 'a.bmp');                                     

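                // unknown mime types leave the favicon name empty, so nothing is cached
                $moz_bookmarks_favicon = isset($icon_type[$icon_mime_type]) ? $icon_type[$icon_mime_type] : '';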
                // TODO reimplement echo 'warning: you should add:' . $icon_mime_type;              

                //if ('http://apps.facebook.com/frontierville/' == $moz_bookmarks_url)
                //{
                //  echo $moz_bookmarks_favicon_id . " - " . $icon_mime_type . " - " .
                //      $moz_bookmarks_favicon;
                //}
            }   
        }   
    }
    // only cache the icon when a known favicon type was found
    if ($moz_bookmarks_favicon) {
        $populair_cache = new EdlCache($moz_bookmarks_url, $moz_bookmarks_favicon);
        $obj = $populair_cache->CheckCacheData($icon_data, FILECACHE_FIREFOX, false);
    }

    // we don't want to return the data, the cache only needs to be updated
    return;         
}
 } 

I hope I can give you clues to take this a step further.

1) What I wanted is an easier way to manage my bookmarks and enrich them with information that is already out there.
2) The different systems out there, like Alexa, Delicious, StumbleUpon etc., do not all give information at the URL level. Alexa, for example, gives information at a higher level in the domain structure: for abc.def.com you need def.com for the ranking, and for abc.def.com/whatever/rtc.php you may need e.g. def.com/user (as with YouTube). So you need both the domain structure and the relative URL structure, with every single node of every possible URL (both domain and relative) as an entry in the database, so that you can later represent and enrich it. You also need the relations between the parts of the URL to be able to represent it.
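As a rough illustration of what "every node of a URL" means, here is a small sketch. The names urlNodes and addNodes are hypothetical, and the array stands in for the database table; my real classes do more than this. Note that it simply splits on dots and slashes and knows nothing about the official suffix list.

```php
<?php
// Hypothetical helper: list every domain node and path node of a URL,
// from the most general node to the most specific one.
function urlNodes(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return [];
    }
    $nodes = [];

    // Domain nodes, e.g. com, def.com, abc.def.com
    $domain = '';
    foreach (array_reverse(explode('.', strtolower($parts['host']))) as $label) {
        $domain = ($domain === '') ? $label : $label . '.' . $domain;
        $nodes[] = $domain;
    }

    // Path nodes, e.g. abc.def.com/whatever, abc.def.com/whatever/rtc.php
    $path = $domain;
    foreach (array_filter(explode('/', $parts['path'] ?? ''), 'strlen') as $segment) {
        $path .= '/' . $segment;
        $nodes[] = $path;
    }
    return $nodes;
}

// Hypothetical helper: store the nodes in an in-memory "table" where each
// node points to its parent, so the relations between the parts are kept.
function addNodes(array &$table, array $nodes): void
{
    $parentId = null;
    foreach ($nodes as $node) {
        if (!isset($table[$node])) {
            $table[$node] = ['id' => count($table) + 1, 'parent' => $parentId];
        }
        $parentId = $table[$node]['id'];
    }
}
```

With this, a ranking that a service only reports for def.com can be attached to the def.com node and still be found from abc.def.com/whatever/rtc.php by walking up the parents.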

  1. I have written a class that loads the official TLD structure and uses it as the root items in my database. So .uk gets id 1 and .co.uk gets parent id 1. The main source is the Mozilla Public Suffix List (http://publicsuffix.org/), but that is a bit outdated, so you need to add to it from other lists.

  2. Now that I have the official TLDs in there, I have a class that loads the Alexa top 1.000.000 sites. These are linked in the same way. Many of them are a sort of unofficial top-level domain: “google.com” is not as official as the TLD of some country, but it seems more important. By doing this you will discover some patterns but also some exceptions, e.g. you will find IP addresses that are popular. Each of these entries fills the field “Alexa Ranking”. (For performance I first load the .csv into a helper table.)
    Alexa will force you to review the patterns, so that is good (a good test set).

  3. I have written a class that traverses my Firefox databases (SQLite) and exports all URLs in there as .url files in a hierarchical directory structure. It also exports the favicons, whether they are .ico, .png, .gif etc. (see below). This is also read into the database. Since I update this a lot, it syncs with the database described in 1 and 2. (In the beginning I also exported the dividers, but I stopped doing that.)

  4. I have begun to simply drag and drop bookmarks into this directory structure from other browsers, e.g. from Chrome I drag a bookmark from the browser to the directory, which also produces a .url file. I have given the directory structure of URLs extra properties: e.g. (h) at the beginning of a name leads to a “heart”, i.e. a URL I particularly like, and #01# places it at the top (or at least that's the way the code further on handles it). I have placed this directory structure in a Dropbox. I still have to write the server-side code that keeps it in constant sync with Dropbox. (My WordPress counterpart on the server constantly reads the URL directory structure to sync and update bookmarks as above, but for now I use FTP sync.)

  5. I have written classes for Delicious (you need an MD5 hash) and StumbleUpon to get not only the ranking (Delicious = number of bookmarks; SU = number of reviews and number of pageviews) but also the tags and the descriptions people use (why should I invent my own tags if people have already provided them?). Since you have a limited number of calls you can make to these systems, you need to spread them over time to enrich your database. (If you go to Delicious now and look up a link, see the right-hand side to get an idea of the taxonomy of tags given to links.)

  6. I use the Google favicon provider (const GOOGLE_ICON_URL = 'http://www.google.com/s2/favicons?domain=';) to show the favicons, BUT since Google does not have all the icons (e.g. not for Facebook applications) I enrich the cache with the icons I exported from Firefox. For that you need a priority system built in that chooses the correct favicon over the other.

  7. To cache this I have a caching structure that looks like the domain reversed at the dots, e.g. .com.facebook.apps.geochallenge, and on a deeper level the relative path structure. In each directory of that cache structure I store the cached favicons. In a previous release I also stored the results of the calls to Delicious and StumbleUpon there.
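Points 6 and 7 can be sketched together as follows. This is only an illustration under assumptions: cacheDirFor and pickFavicon are hypothetical names, the file names a.ico, a.png etc. are taken from the $icon_type array in the export class above, and the priority rule shown here (a locally cached icon wins, Google is the fallback) is one simple possibility and not necessarily the rule my own code applies.

```php
<?php
const GOOGLE_ICON_URL = 'http://www.google.com/s2/favicons?domain=';

// Hypothetical helper: build the cache directory for a URL by reversing
// the host labels and then appending the relative path segments.
function cacheDirFor(string $url, string $cacheRoot): string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        throw new InvalidArgumentException("URL has no host: $url");
    }
    $labels = array_reverse(explode('.', strtolower($parts['host'])));
    $dir = rtrim($cacheRoot, '/') . '/.' . implode('.', $labels);

    // Deeper levels follow the relative path of the URL.
    foreach (array_filter(explode('/', $parts['path'] ?? ''), 'strlen') as $segment) {
        $dir .= '/' . rawurlencode($segment);
    }
    return $dir;
}

// Hypothetical priority rule: use an icon already in the cache directory
// (e.g. one exported from Firefox), otherwise fall back to Google.
function pickFavicon(string $url, string $cacheRoot): string
{
    $dir = cacheDirFor($url, $cacheRoot);
    foreach (['a.ico', 'a.png', 'a.gif', 'a.jpg', 'a.bmp'] as $name) {
        if (is_file($dir . '/' . $name)) {
            return $dir . '/' . $name;
        }
    }
    return GOOGLE_ICON_URL . parse_url($url, PHP_URL_HOST);
}
```

For a URL on the host geochallenge.apps.facebook.com this gives a directory named .com.facebook.apps.geochallenge under the cache root, with one subdirectory per path segment below it.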

It may seem that this is out of scope for WordPress, but in fact (grin) it is very much in scope. The built-in link functionality has no good meta options (no meta table) and has further restrictions, such as no hierarchical categories. You also have to type the information in yourself, while there are a lot of services that already categorize URLs (e.g. dmoz) and give them tags etc., which have become a sort of default.

So this lies “under” my WordPress site for handling my links.

I am building this setup with the information for at least the top 1.000.000 plus sites to reduce the number of calls and to share it later as a plugin. I have a StumbleUpon plugin in the WP plugin directory, and that led to this. It can give you information on the external links you have in your weblog. There are a lot of plugins that give you general SEO information, but none that show you reports and comparisons such as 'what percentage of your outgoing links is in which category, or popular or not'. It also gives rankings for your incoming and outgoing links, etc.
