The goal of content cleansing is to ensure we pass the cleanest possible signal into our interest graph service. It is a multi-step process that involves:
You can check out the results of the content cleansing phase for your user's 10 articles below. To learn more, head over to our Goose Content Extractor Labs page.
Now that we have the data for each article, we run each one through our Interest Graphing process.
Now that we've graphed all the articles the user has viewed, we merge them together and prune them to create the user's interest graph.
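To make the merge-and-prune step concrete, here is a minimal sketch. It is not the actual implementation: it assumes each article's graph can be reduced to a mapping from topic name to a numeric weight, that merging means summing weights across articles, and that pruning means dropping topics below a cutoff. The function names `merge_article_graphs` and `prune_graph` and the `min_weight` parameter are hypothetical.

```python
from collections import Counter


def merge_article_graphs(article_graphs):
    """Combine per-article graphs by summing the weight of each topic.

    Assumption: each graph is a dict of topic name -> numeric weight.
    """
    merged = Counter()
    for graph in article_graphs:
        # Counter.update adds the incoming weights to the running totals.
        merged.update(graph)
    return dict(merged)


def prune_graph(merged, min_weight):
    """Keep only topics whose combined weight reaches min_weight.

    Assumption: a simple weight cutoff stands in for the real pruning rule.
    """
    return {topic: weight for topic, weight in merged.items() if weight >= min_weight}


# Hypothetical per-article graphs for three viewed articles.
article_graphs = [
    {"python": 3, "cooking": 1},
    {"python": 2, "travel": 1},
    {"cooking": 1},
]

merged = merge_article_graphs(article_graphs)
user_interest_graph = prune_graph(merged, min_weight=2)
```

In this sketch, a topic that shows up weakly in a single article is pruned, while topics reinforced across several articles survive into the user's interest graph.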