Duplicates in Database

I have noticed some duplicates in the database.

I wanted to mention them to you so you wouldn’t be surprised to see them.

I believe these are the result of there being more than one way to reach some sub-categories.
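As a rough illustration of how such duplicates could be spotted, here is a minimal Python sketch. It assumes each crawled row can be reduced to a sub-category ID plus the path the crawler followed to reach it. The function name, the IDs, and the sample paths are all made up for the example and are not taken from the actual database.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group crawled rows by sub-category ID and keep the IDs seen more than once."""
    paths_by_id = defaultdict(list)
    for node_id, path in records:
        paths_by_id[node_id].append(path)
    # Only IDs reached by two or more paths count as duplicates.
    return {node_id: paths for node_id, paths in paths_by_id.items() if len(paths) > 1}


# Hypothetical sample rows: (sub-category ID, path the crawler followed)
sample_records = [
    ("1001", "Books > Cooking > Baking"),
    ("1002", "Books > Cooking > Grilling"),
    ("1001", "Home & Kitchen > Bakeware > Baking"),
]

duplicates = find_duplicates(sample_records)
for node_id, paths in duplicates.items():
    print(node_id, paths)
```

Grouping by ID rather than by name is a deliberate choice here, since two different sub-categories could share the same display name.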

This also points to a way I can improve the Amazon API crawler. When I capture the “path” of an individual sub-category, I rely on the “path” tree that Amazon’s API reports in its information about that sub-category.

As a result, the sub-category shows up only in the tree that Amazon has defined as primary, rather than under each of the various routes that lead to it.

I will look into whether I can also capture those alternative paths to the destination sub-category.
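One possible way to collect the alternative paths is sketched below in Python. It assumes the crawler can build a map from each sub-category ID to the IDs of all of its parents, and that the category structure contains no loops. The function name, the maps, and the sample categories are hypothetical and do not reflect Amazon’s actual API responses.

```python
def all_paths(node_id, parents, names):
    """Return every top-level-to-node path for node_id.

    parents maps a node ID to a list of its parent IDs (assumed to contain no cycles).
    names maps a node ID to its display name.
    """
    parent_ids = parents.get(node_id, [])
    if not parent_ids:
        # A node with no parents is a top-level category.
        return [[names[node_id]]]
    paths = []
    for parent_id in parent_ids:
        for path in all_paths(parent_id, parents, names):
            paths.append(path + [names[node_id]])
    return paths


# Hypothetical sample data: "Baking" is reachable through two different trees.
names = {
    "1": "Books",
    "2": "Home & Kitchen",
    "10": "Cooking",
    "20": "Bakeware",
    "1001": "Baking",
}
parents = {"10": ["1"], "20": ["2"], "1001": ["10", "20"]}

baking_paths = all_paths("1001", parents, names)
for path in baking_paths:
    print(" > ".join(path))
```

Storing every path against a single sub-category ID would let the sub-category appear under each route without creating a second database record for it.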


Bill Platt