1. What is the capability about?
Welcome to WhereGotText: the first-ever text-mining tool for ecommerce Ratings & Reviews (R&R), which analyzes the top-occurring words in the reviews behind user-submitted product URLs! The tool combines data scraping with R&R analysis, so the generated report can be customized flexibly. You can submit any product URL from Tmall, Taobao, JD or Amazon to see the scraped R&R and the highest-frequency words in them, and so understand what buyers mention most or are most concerned about. This can offer deep insight into how consumers talk about the product.
2. I am new to WhereGotText, how do I use this?
All you need to do is find a product URL from Tmall, JD, Taobao or Amazon, go to ‘Submit New Task’, enter the URL in the format http://www.XXX.com/productID, enter your preferred title, then submit. Once the analysis is complete, the report will be sent to your email. Open the Excel file to see the top-occurring words, and the number of times each appeared in the R&R scraped from the URL, under the tab ‘frequencyList’. The tab ‘yourData’ lists the R&R scraped, their dates, the product URL and the SKU size bought. The frequency count covers only the data in ‘yourData’. To understand the context in which the top-occurring words are used, press Ctrl+F to find them in the original verbatim.
3. How is the data scraping done? Is there a limit to the number of verbatim scraped?
For Tmall and Taobao, we scrape R&R from pages 1 to 100 in the default sort order (按默认). For JD (JingDong), we scrape R&R from pages 1 to 150 in the default recommended sort order (推荐排序). For Amazon, we scrape up to the first 4000 comments in the default sort order on the ecommerce page. We do not sort comments by most recent date because those comments are usually much shorter and contain far less information, whereas comments in the default recommended order (推荐排序) are generally more detailed and of greater value.
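For quick reference, the limits above can be written down as a small lookup table. This is only an illustrative sketch: the name SCRAPE_LIMITS, the dictionary layout and the helper describe_limit are hypothetical and are not part of the tool itself.

```python
# Limits as stated in this FAQ. The names and structure are hypothetical.
SCRAPE_LIMITS = {
    "tmall": {"unit": "pages", "max": 100},
    "taobao": {"unit": "pages", "max": 100},
    "jd": {"unit": "pages", "max": 150},
    "amazon": {"unit": "comments", "max": 4000},
}


def describe_limit(site: str) -> str:
    """Return a one-line description of the scraping limit for a site."""
    limit = SCRAPE_LIMITS[site.lower()]
    return f"{site}: up to {limit['max']} {limit['unit']} in the default sort order"
```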
4. What is the method used to develop the frequency of mentions plot?
After scraping consumers' verbatim from the ecommerce sites, we remove non-value-adding words using a pre-defined list of stop words, such as 'and', 'then' and 'the'. We then segment the text into words statistically: a sequence of characters that many consumers mention is picked out as a candidate word. These candidates are compared with a database of standard Chinese words from the Chinese dictionary. For example, our analysis software picked up ‘舒服’, which was confirmed to be a standard Chinese word, so we included it in the final report. After segmentation, we count the number of times each word appears. We are also working towards adding common Chinese internet terms (which are not in the Chinese dictionary) to our database of recognizable words, so that our software can pick them up and include them in the analysis report. We welcome suggestions for more common Chinese words to add, so please send your feedback to firstname.lastname@example.org.
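The counting step can be illustrated with a much simplified sketch. It does not reproduce the tool's statistics-based segmentation: instead it takes every run of one or two adjacent characters as a candidate, skips stop words, and keeps only candidates found in a dictionary. The STOP_WORDS and DICTIONARY sets are tiny, made-up stand-ins, and frequency_list is a hypothetical name.

```python
from collections import Counter

# Hypothetical, tiny stand-ins for the real stop-word list and dictionary.
STOP_WORDS = {"很", "的", "了"}
DICTIONARY = {"舒服", "质量", "物流", "很", "好"}


def frequency_list(comments, max_len=2):
    """Count dictionary words of up to max_len characters across comments."""
    counts = Counter()
    for text in comments:
        for n in range(1, max_len + 1):
            for i in range(len(text) - n + 1):
                candidate = text[i:i + n]
                if candidate in STOP_WORDS:
                    continue  # stop words are never counted
                if candidate in DICTIONARY:
                    counts[candidate] += 1
    # Most frequent words first, as in the 'frequencyList' tab.
    return counts.most_common()
```

With the comments "穿着很舒服", "质量好，很舒服" and "物流快", the word ‘舒服’ comes out on top with a count of 2, while ‘很’ is dropped as a stop word even though it is in the dictionary.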
5. How do I best make use of the tool?
This everyday tool serves a wide variety of functions. The 3 most common uses are:
• Monitoring the R&R of competitive products to understand a certain benefit space, or of a specific product for benchmarking purposes
• Post-launch tracking of newly launched products to see whether consumers' feedback is consistent with what was expected
• Getting raw R&R data at your fingertips to perform other analyses
6. Which platform works best for the tool?
We recommend Google Chrome on desktop. Otherwise, you can run the tool in Internet Explorer on desktop or on your mobile phone.
7. Why is there an error when I submit the URL?
Either the ecommerce site is not supported or the format of the URL is wrong. This tool currently only works for Taobao, Tmall, JD or Amazon URLs. If you submitted URLs from these ecommerce sites and still encounter an error, please ensure that you copy and paste the URL exactly as seen in your browser. If you still encounter an error, please email us at email@example.com with your URL and a screenshot of the error on the system.
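If you want to check a URL before submitting it, the two failure cases can be sketched as follows. The list of domain suffixes is an assumption (Amazon, for example, operates several regional domains, and the tool's real matching rules are not documented here), and check_url is a hypothetical helper, not part of the tool.

```python
from urllib.parse import urlparse

# Assumed domain suffixes. The tool's actual rules may differ.
SUPPORTED_SUFFIXES = ("tmall.com", "taobao.com", "jd.com", "amazon.com", "amazon.cn")


def check_url(url: str) -> str:
    """Classify a URL as 'ok', 'bad format' or 'unsupported site'."""
    parsed = urlparse(url.strip())
    # A usable URL needs an http(s) scheme and a host name.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return "bad format"
    host = parsed.hostname.lower()
    if any(host == s or host.endswith("." + s) for s in SUPPORTED_SUFFIXES):
        return "ok"
    return "unsupported site"
```

A URL pasted without its http:// or https:// prefix is reported as 'bad format', which is why copying it exactly as shown in the browser matters.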
8. How long does the report take to generate?
A report can take up to 48 hours to generate. If you submit several requests, processing could take longer. You can track the status of your submission under ‘My Tasks’ and ‘Status’. If the status is ‘running’, the report is likely to be generated within 24 hours. If the status is ‘processing’, the report is likely to be generated within 48 hours.
9. If I encounter an error with the report, what should I do? Should I resubmit the analysis?
If error terms appear in the frequency list, please email us at firstname.lastname@example.org with your report submission and the unexpected error terms. If data is missing from the generated report, please email us with the affected report attached, so we can investigate. You should not resubmit the analysis.
10. Why does the number of R&R scraped differ from the total number shown on the ecommerce URL?
There are 2 possible reasons why the number of R&R scraped differs from the total count on the ecommerce URL. The first is that the total shown on the ecommerce website includes R&R without any content. Our scraping automatically filters out these empty comments, because they are not useful for the frequency analysis. The second is that the total number of R&R on the ecommerce site exceeds the maximum number scraped for each URL. See the section ‘Is there a limit to the number of verbatim scraped?’ above for more details.
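Both effects can be seen in a small sketch. It assumes the cap is applied first and empty comments are dropped afterwards, which is one plausible ordering and not a confirmed detail of the tool. The function name usable_reviews and its parameters are hypothetical.

```python
def usable_reviews(reviews, max_scraped):
    """Apply the scraping cap, then drop comments with no content.

    Assumption: the cap comes before the empty-comment filter.
    """
    scraped = reviews[:max_scraped]
    # Keep only comments that contain something other than whitespace.
    return [r for r in scraped if r and r.strip()]
```

For a site that shows 6 reviews, of which 3 are empty, a cap of 5 leaves only 2 usable reviews in the report.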
11. When would words containing 2 Chinese characters be prioritised over those with a single Chinese character?
12. Can I scrape only R&R posted during a specific time period?
Not for the time being, but we are working on this. In the meantime, you can generate a report, sort the comments by date, then pick out the comments posted within your chosen time period.
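If you prefer to do this filtering outside Excel, a minimal sketch looks like the following. It assumes you have exported the ‘yourData’ rows as (date, comment) pairs with dates written as YYYY-MM-DD. The actual date format in your report may differ, and reviews_between is a hypothetical helper.

```python
from datetime import date


def reviews_between(rows, start, end):
    """Keep (date_string, comment) rows whose date lies in [start, end].

    Assumption: date strings use the ISO format YYYY-MM-DD.
    """
    return [
        (day, comment)
        for day, comment in rows
        if start <= date.fromisoformat(day) <= end
    ]
```

For example, calling it with start=date(2020, 2, 1) and end=date(2020, 2, 29) keeps only the rows dated in February 2020.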
13. How do I interpret the frequency plot?
14. Why are there repeating terms of similar meaning in the frequency plot?
These words may have similar meanings but consist of different Chinese characters. Our software picks up every unique character and word, as long as it exists in the Chinese dictionary. We are working towards a more advanced capability that lets users combine terms they feel have similar meanings in the frequency plot. See also the question ‘What is the method used to develop the frequency of mentions plot?’ above.
15. Can I download the pictures appearing in R&R?
Yes, you can, but we only provide the image links. Copy and paste a link into your browser, then right-click the image that appears to save it.
16. How should I best make use of the report with the picture links?
We suggest searching for the keywords you are concerned about with Ctrl+F, then viewing the images associated with them. For instance, if a skin care brand had complaints about product leakage during shipment, the project team could look at the pictures consumers uploaded whenever their R&R contained keywords about leakage.
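The same lookup can be sketched in code for large reports. The row layout, a comment paired with a list of its image links, is a hypothetical simplification of the report, and images_for_keyword is not a function of the tool.

```python
def images_for_keyword(rows, keyword):
    """Collect image links from every comment that contains the keyword.

    rows: iterable of (comment, image_links) pairs (hypothetical layout).
    """
    links = []
    for comment, image_links in rows:
        if keyword in comment:
            links.extend(image_links)
    return links
```

Searching for a leakage-related keyword such as ‘漏’ would then return only the links of pictures attached to comments that mention it.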
17. Is there a version of this tool in other languages?
Our tool has an English user interface and runs the analysis on URLs with English or Chinese R&R. Neither the R&R nor the report is translated between Chinese and English, because we keep the consumers' original wording. You can engage a translation agency if you need a translation.