I am doing research on how mobile phones have evolved over the years, so I need to create a database with the specifications of as many phones as possible. I am trying to scrape the data from the GSM Arena website.
Example page: http://www.gsmarena.com/samsung_galaxy_note7-8082.php
I am using XPath expressions that match on the label preceding each value, for example: //tr[contains(., "Sensors")]/td[2]
But some values, always the last one in a category, have no preceding label.
How do I pick this info:
Non-removable Li-Po 3500 mAh battery
or this info:
Fast battery charging Qi wireless charging (market dependent) ANT+ support S-Voice natural language commands and dictation MP4/DivX/XviD/WMV/H.265 player MP3/WAV/WMA/eAAC+/FLAC player Photo/video editor Document editor
Note that different phones have a different number of rows on the page, so a fixed positional index like [number] in the XPath would pick different information on each page:
http://www.gsmarena.com/samsung_galaxy_note7-8082.php : I need the 5th row of Features
http://www.gsmarena.com/samsung_sgh_600-49.php : I need the 8th row of Features
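
To make the problem concrete, here is a minimal sketch of the label-based lookup next to one possible way to reach the unlabeled last row without a fixed index. It rests on several assumptions: the markup in PHONE_A and PHONE_B is hypothetical and simplified (each category is assumed to be its own table with the category name in a th cell), the PHONE_B values are invented, and the helper names labeled_value and last_value are made up for illustration. It uses Python's standard xml.etree.ElementTree, which only supports a subset of XPath, so an exact match (td='Sensors') stands in for contains(). The real pages are HTML and would need an HTML-aware parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real pages are HTML and will differ.
PHONE_A = """<div>
<table>
  <tr><th>Features</th><td>Sensors</td><td>Iris scanner, fingerprint</td></tr>
  <tr><td>Messaging</td><td>SMS, MMS, Email</td></tr>
  <tr><td></td><td>Fast battery charging</td></tr>
</table>
<table>
  <tr><th>Battery</th><td></td><td>Non-removable Li-Po 3500 mAh battery</td></tr>
</table>
</div>"""

# Invented second phone with more rows in the same category.
PHONE_B = """<div>
<table>
  <tr><th>Features</th><td>Messaging</td><td>SMS</td></tr>
  <tr><td>Clock</td><td>Yes</td></tr>
  <tr><td>Alarm</td><td>Yes</td></tr>
  <tr><td>Games</td><td>Yes</td></tr>
  <tr><td></td><td>Organizer</td></tr>
</table>
</div>"""


def cell_text(cell):
    """Return the stripped text of a cell, or None if the cell is missing."""
    return None if cell is None else "".join(cell.itertext()).strip()


def labeled_value(root, label):
    """Label-based lookup: second td of the row that has a td equal to label."""
    return cell_text(root.find(f".//tr[td='{label}']/td[2]"))


def last_value(root, category):
    """Last cell of the last row in the table whose th equals category."""
    for table in root.iter("table"):
        heading = table.find(".//th")
        if heading is not None and (heading.text or "").strip() == category:
            # last() avoids hard-coding the row number, which varies per phone.
            return cell_text(table.find("tr[last()]/td[last()]"))
    return None


if __name__ == "__main__":
    for page in (PHONE_A, PHONE_B):
        root = ET.fromstring(page)
        print(last_value(root, "Features"))
```

If the real markup has this shape, the same idea in full XPath might look something like //table[.//th[contains(., "Features")]]//tr[last()]/td[last()], but I have not confirmed that against the actual pages.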