Trending Topics

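Imagine you are in the hiring process of a company whose main business is analyzing the information that appears on the web. One of the tests consists of writing a program that maintains an up-to-date set of trending topics. Whether you are hired depends on the efficiency of your solution.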

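The company provides you with text from a highly active blog. The text is organized by day, and whenever a query arrives you must provide the sorted list of the $N$ most frequent words over the last 7 days.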
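One way to keep the 7-day counts current is a sliding window: store one counter per day, add each new day to a running total, and subtract the oldest day once more than 7 are stored. The sketch below is only an illustration of that idea; the class `WeeklyCounts` and its method names are hypothetical and not part of the problem statement.

```python
from collections import Counter, deque


class WeeklyCounts:
    """Word counts over the most recent `window` days (hypothetical helper)."""

    def __init__(self, window=7):
        self.window = window
        self.days = deque()     # one Counter per stored day, oldest first
        self.total = Counter()  # aggregated counts over the stored days

    def add_day(self, words):
        """Register one day's words and retire the oldest day if needed."""
        day = Counter(words)
        self.days.append(day)
        self.total.update(day)
        if len(self.days) > self.window:
            old = self.days.popleft()
            for word, count in old.items():
                self.total[word] -= count
                if self.total[word] == 0:
                    del self.total[word]  # drop words that left the window
```

With this layout, adding a day costs time proportional to the number of distinct words in the incoming day plus those in the retired day, rather than a recount of the whole week.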

Input Format

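Each input file contains one test case. The text corresponding to a day is delimited by the tag <text>. Top $N$ queries can appear between the texts of two different days. A Top $N$ query appears as a tag of the form <top 10 />. To make the input easier to read, the number is always delimited by whitespace, as in the sample.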

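Notes: All words consist only of lowercase letters and have length at most 20. The maximum number of different words is 20000. The maximum number of words per day is 20000. Words shorter than 4 characters are considered of no interest and are ignored. The number of days is at most 1000. $1 \le N \le 20$.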
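Because the tags and the query number are whitespace-delimited, the whole input can be split into tokens and scanned once. The following is a minimal sketch under the assumption that the input is well formed; the function name `parse_events` and the event tuples it yields are hypothetical choices made for illustration.

```python
def parse_events(raw, min_len=4):
    """Yield ("day", [words]) and ("top", n) events from the raw input text.

    Assumes well-formed input: every <text> has a matching </text>, and
    every query is written as the three tokens "<top", "N", "/>".
    """
    tokens = iter(raw.split())
    for token in tokens:
        if token == "<text>":
            words = []
            for word in tokens:  # keeps consuming the same iterator
                if word == "</text>":
                    break
                if len(word) >= min_len:  # shorter words are of no interest
                    words.append(word)
            yield ("day", words)
        elif token == "<top":
            n = int(next(tokens))
            next(tokens, None)  # consume the closing "/>"
            yield ("top", n)
```

For example, `list(parse_events("<text> the text is in a blog </text> <top 3 />"))` gives `[("day", ["text", "blog"]), ("top", 3)]`.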

Output Format

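When a query is received, the $N$ most frequent words over the last 7 days must be displayed. Words must be listed in decreasing order of frequency, with ties broken alphabetically. Every word whose count equals that of the word in position $N$ must also be displayed, even if the total number of words shown then exceeds $N$. As the sample shows, each answer is enclosed between <top N> and </top>, with one word and its count per line.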
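The ordering and tie rule can be sketched as a sort on the key (descending count, then word) followed by a cutoff at the count found in position $N$. This is an illustrative sketch rather than a tuned solution; `top_words` and `format_top` are hypothetical names, and `counts` is assumed to map each word to a positive count within the 7-day window.

```python
def top_words(counts, n):
    """Return (word, count) pairs for a Top-n query, keeping ties at rank n."""
    # Sort by count descending, then alphabetically for equal counts.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # count of the word in position n
    return [(word, count) for word, count in ranked if count >= cutoff]


def format_top(counts, n):
    """Render one query answer in the layout used by the sample output."""
    lines = [f"<top {n}>"]
    lines += [f"{word} {count}" for word, count in top_words(counts, n)]
    lines.append("</top>")
    return "\n".join(lines)
```

With the counts `{"blah": 9, "text": 4, "corresponding": 3, "please": 3, "file": 2}` and `n = 3`, `top_words` returns four pairs, because `please` ties with `corresponding` at position 3. This matches the last query of the sample.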

Sample

Input 1

<text>
imagine you are in the hiring process of a company whose
main business is analyzing the information that appears
in the web
</text>
<text>
a simple test consists in writing a program for
maintaining up to date a set of trending topics
</text>
<text>
you will be hired depending on the efficiency of your solution
</text>
<top 5 />
<text>
they provide you with a file containing the text
corresponding to a highly active blog
</text>
<text>
the text is organized daily and you have to provide the
sorted list of the n most frequent words during last week
when asked
</text>
<text>
each input file contains one test case the text corresponding
to a day is delimited by tag text
</text>
<text>
the query of top n words can appear between texts corresponding
to two different days
</text>
<top 3 />
<text>
blah blah blah blah blah blah blah blah blah
please please please
</text>
<top 3 />

Output 1

<top 5>
analyzing 1
appears 1
business 1
company 1
consists 1
date 1
depending 1
efficiency 1
hired 1
hiring 1
imagine 1
information 1
main 1
maintaining 1
process 1
program 1
simple 1
solution 1
test 1
that 1
topics 1
trending 1
whose 1
will 1
writing 1
your 1
</top>
<top 3>
text 4
corresponding 3
file 2
provide 2
test 2
words 2
</top>
<top 3>
blah 9
text 4
corresponding 3
please 3
</top>

QOJ #5347. Trending Topics