Identifying elements from data feeds generated by affiliate sites
Posted by SPI on Programmers
Published on 2012-08-05T12:24:42Z
I am working with data feeds from affiliate sites. The basic idea is to provide an interface where the user can paste a link to an XML data feed (these are huge, btw, around 60 MB). The feed would then be streamed, parsed in small chunks, and mined for the required data, which would be stored in the database.
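For illustration, here is a minimal sketch of the streaming step, assuming Python and the standard library's xml.etree.ElementTree.iterparse. The `product` record tag, the `iter_records` helper, and the sample feed are hypothetical and only stand in for whatever a real affiliate feed contains.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature feed; a real one would be a ~60 MB file or HTTP stream.
SAMPLE_FEED = b"""<feed>
  <product><name>Kettle</name><price>19.99</price></product>
  <product><name>Toaster</name><price>24.50</price></product>
</feed>"""


def iter_records(source, record_tag):
    """Yield each <record_tag> element as soon as it has been fully parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield elem
            # Runs when the caller asks for the next record: drop this
            # record's children and text so memory use stays roughly flat.
            elem.clear()


names = [rec.findtext("name") for rec in iter_records(io.BytesIO(SAMPLE_FEED), "product")]
print(names)
```

Each record has to be consumed before the next one is requested, because it is cleared afterwards. Any file-like object can be passed as `source`, so a streamed HTTP response body should work the same way as a local file.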
The problem is that different affiliate sites use different schemas for their XML. It is hard to map the elements in a feed to your database attributes when you don't actually know which element contains what.
My solution: use XPath to traverse the first parent element and its descendants, fetch the element names along with their data, and ask the user to map each element to a database attribute by selecting from a set of radio buttons that represent the database attributes. This would be done just once for each new feed; once the system knows what's what, it will automatically upload the data from the XML to the database.
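A rough sketch of that idea, assuming Python's xml.etree.ElementTree and its limited XPath support. The helpers `leaf_paths` and `apply_mapping`, the sample record, and the column names `title` and `price` are all made up for illustration; the mapping dict stands in for whatever the user picks with the radio buttons.

```python
import xml.etree.ElementTree as ET

# Hypothetical first record of a feed.
FIRST_RECORD = ET.fromstring(
    "<product>"
    "<name>Kettle</name>"
    "<price><amount>19.99</amount><currency>GBP</currency></price>"
    "</product>"
)


def leaf_paths(record, prefix=""):
    """Return {relative_path: sample_text} for every leaf element of a record."""
    found = {}
    for child in record:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child):  # element has children, so descend into it
            found.update(leaf_paths(child, path))
        else:
            found[path] = (child.text or "").strip()
    return found


def apply_mapping(record, mapping):
    """Build a row dict from a record; mapping is {db_column: relative_path}."""
    return {col: (record.findtext(path) or "").strip() for col, path in mapping.items()}


# Shown to the user once per new feed, with sample values to help them choose.
candidates = leaf_paths(FIRST_RECORD)

# Assumed result of the user's one-time selection, stored per feed.
mapping = {"title": "name", "price": "price/amount"}
row = apply_mapping(FIRST_RECORD, mapping)
print(candidates)
print(row)
```

Showing the sample value next to each path should make the one-time mapping less error-prone. This sketch ignores attributes and XML namespaces, and repeated sibling tags would overwrite each other in `leaf_paths`, so a real feed may need more care.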
Does this sound viable? Is there a better solution? I realize this leaves an uncomfortable opening for human error.
Thanks.
© Programmers or respective owner