重链聚类 - 問題

#12765. 重链聚类

統計

AI Translation — This statement was translated using AI and may contain inaccuracies. Please refer to the original statement if in doubt.

一组生物学家正在尝试寻找一种病毒性疾病的治疗方法。他们尝试了许多来自不同来源、可能对抗病毒抗原的抗体，并从中挑选出了 $n$ 个在实验中表现最好的抗体。

每个抗体由其重链（heavy chain）标识，重链是一个氨基酸序列。

如果一组抗体满足以下至少一个条件，则它们构成一个相似簇（similarity cluster）： 所有抗体重链的 $k$-前缀（前 $k$ 个氨基酸）相同； 所有抗体重链的 $k$-后缀（后 $k$ 个氨基酸）相同。

为了简化后续研究，生物学家希望将抗体分组为相似簇。你需要将给定的抗体划分为最少数量的相似簇。

输入格式

第一行包含两个整数 $n$ 和 $k$ —— 抗体的数量以及需要匹配的氨基酸序列长度（$1 \le n \le 5000$，$1 \le k \le 550$）。

接下来的 $n$ 行包含构成抗体重链的氨基酸序列。每个氨基酸用一个大写英文字母表示。每条重链包含至少 $k$ 个且不超过 $550$ 个氨基酸。

输出格式

第一行输出一个整数 —— 相似簇的最小数量。

接下来的每一行包含一个簇的描述。

每个描述以 $m_i$ 开头 —— 该簇中抗体的数量，随后是 $m_i$ 个整数 —— 这些抗体的编号。抗体按输入顺序从 $1$ 开始编号。

每个抗体必须恰好出现在一个簇中。

样例

样例输入 1

4 1
AA
AB
BB
BA

样例输出 1

2
2 1 2
2 3 4

样例输入 2

3 2
ABA
BAB
XY

样例输出 2

Discussions

About Discussions

The discussion section is only for posting: General Discussions (problem-solving strategies, alternative approaches), and Off-topic conversations.

This is NOT for reporting issues! If you want to report bugs or errors, please use the Issues section below.

Open Discussions 0

No discussions in this category.

Issues

About Issues

If you find any issues with the problem (statement, scoring, time/memory limits, test cases, etc.), you may submit an issue here. A problem moderator will review your issue.

Guidelines:

This is not a place to publish discussions, editorials, or requests to debug your code. Issues are only visible to you and problem moderators.
Do not submit duplicated issues.
Issues must be filed in English or Chinese only.

Active Issues 0

No issues in this category.

Closed/Resolved Issues 0