I am really thankful to Openlibrary.org (special thanks to George) for doing a wonderful job of aggregating book data from across the globe. I bulk-downloaded the data from their site and am now trying to import it into a MySQL database using a PHP script. The process has been very dramatic so far. Two weeks of import effort were wasted when my web hosting platform decided to roll back the table to its state from a week earlier 🙂 So now I am running the scripts again (believe me, it's very boring).
The area of data I am focusing on now is information about the latest books. For that I am planning to work with the top 10 publishers of books and ebooks in the US, namely: Random House, Taylor & Francis, Macmillan, McGraw-Hill, Thomas Nelson, Wiley, Hachette, Pearson, and Simon and Schuster. I believe that if I get regular feed updates from these publishers, I will be able to cover at least 80% of the new books published. Some of these publishers, like Random House, Thomas Nelson and Wiley, have a very well laid out ONIX feed. However, I have not yet heard back from the others. I will be surprised if they don't have a similar program.
I am halfway through writing a PHP script to read from an ONIX feed, and the Wiley feed is going to be my test data. Hoping to finish it tonight… More updates tomorrow.
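To give a rough idea of what reading an ONIX feed involves, here is a minimal sketch. It is in Python rather than PHP, purely for illustration, and it is not the script mentioned above. It assumes an ONIX 2.1 style file with reference tag names (`Product`, `ProductIdentifier`, `Title`, `Publisher`) and no XML namespace; the sample record, `SAMPLE_ONIX`, and `parse_onix` are made up for this example, and a real publisher feed may well be laid out differently.

```python
import xml.etree.ElementTree as ET

# Made-up sample record, assuming ONIX 2.1 reference tag names.
# ProductIDType "15" is the ONIX code for ISBN-13.
SAMPLE_ONIX = """<ONIXMessage>
  <Product>
    <RecordReference>example-0001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000002</IDValue>
    </ProductIdentifier>
    <Title>
      <TitleType>01</TitleType>
      <TitleText>An Example Book</TitleText>
    </Title>
    <Publisher>
      <PublishingRole>01</PublishingRole>
      <PublisherName>Example Press</PublisherName>
    </Publisher>
    <PublicationDate>20100115</PublicationDate>
  </Product>
</ONIXMessage>
"""


def parse_onix(xml_text):
    """Return one dict per <Product> with the fields needed for a books table."""
    root = ET.fromstring(xml_text)
    books = []
    for product in root.iter("Product"):
        # A product can carry several identifiers; keep only the ISBN-13.
        isbn13 = None
        for ident in product.findall("ProductIdentifier"):
            if ident.findtext("ProductIDType") == "15":
                isbn13 = ident.findtext("IDValue")
        books.append({
            "isbn13": isbn13,
            # findtext returns None when the element is missing.
            "title": product.findtext("Title/TitleText"),
            "publisher": product.findtext("Publisher/PublisherName"),
            "published": product.findtext("PublicationDate"),
        })
    return books


if __name__ == "__main__":
    for book in parse_onix(SAMPLE_ONIX):
        print(book)
```

Each resulting dict maps naturally onto one database row. For a feed too large to hold in memory, a streaming parser such as `xml.etree.ElementTree.iterparse` would probably be a better fit than loading the whole document at once.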
My next big challenge will be to get the cover images… I don't want to shell out money to get these from Bowker.