Hi Haneul Kim, thanks for taking the time to read.

This is a very good question. Let’s look at the simple definition of TF-IDF again:

A high weight in TF-IDF is reached by a high term frequency (in the given document) and a low document frequency of the term in the whole collection of documents.

That means that, for the document frequency, we only need to count how many documents contain the word. Each document therefore counts as one, no matter how many times the word appears in it.
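
To make the counting concrete, here is a minimal sketch in Python. The function names, the toy documents, and the whitespace tokenization are my own assumptions for illustration, and the plain log(N / df) form of IDF is just one common variant (libraries often use smoothed versions):

```python
import math

def document_frequency(term, documents):
    """Number of documents that contain `term` at least once."""
    # Each document contributes at most 1, however often the term repeats in it.
    return sum(1 for doc in documents if term in doc.lower().split())

def inverse_document_frequency(term, documents):
    """Unsmoothed IDF: log(N / df). Returns 0.0 if the term never appears."""
    df = document_frequency(term, documents)
    return math.log(len(documents) / df) if df else 0.0

# Hypothetical toy collection: "cat" appears twice, but only in the first document.
docs = [
    "the cat sat on the mat the cat slept",
    "the dog barked",
    "a bird sang",
]

cat_df = document_frequency("cat", docs)   # counted once for the first document
the_df = document_frequency("the", docs)   # present in two documents
cat_idf = inverse_document_frequency("cat", docs)
```

Here "cat" gets a document frequency of 1 even though it occurs twice in the first document, while "the" gets 2 because it occurs in two different documents.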

Hope I’ve answered your question.

Best Regards, Akash



SDE 2 @Amazon.

