遗传相似性 - 题目

#11143. 遗传相似性

统计

AI Translation — This statement was translated using AI and may contain inaccuracies. Please refer to the original statement if in doubt.

我们考虑 DNA 序列，即仅由字符 A、C、G 和 T 组成的字符串。如果通过删除字符串 $s'$ 中的某些（可能为零个）字符可以得到字符串 $s$，则称 $s$ 是 $s'$ 的子序列。例如，AGG 是 TAGAAG 的子序列，而 AGGA 不是。

对于两个字符串 $s_1, s_2$，我们定义它们的 $m$-相似度为长度为 $m$ 的序列 $w$ 的数量，使得 $w$ 是 $s_1$ 的子序列当且仅当 $w$ 是 $s_2$ 的子序列。换句话说，这是长度为 $m$ 且同时是两个字符串的子序列，或者同时不是两个字符串的子序列的序列数量。

给定字符串 $s_1, s_2$ 和整数 $m$，计算 $s_1$ 和 $s_2$ 的 $m$-相似度。

输入格式

输入包含三行。前两行分别为序列 $s_1$ 和 $s_2$，每个序列包含至少 1 个且最多 1,000,000 个来自集合 $\{A, C, G, T\}$ 的字符。第三行包含一个整数 $m$ ($1 \le m \le 20$)。

输出格式

输出一个整数：$s_1$ 和 $s_2$ 的 $m$-相似度。

样例

样例输入 1

TCAGG
TAGAAG
2

样例输出 1

说明 1

为了计算 TCAGG 和 TAGAAG 的 2-相似度，需要考虑所有 $4^2 = 16$ 种可能的双字母序列。其中三个 (CA, CG 和 TC) 仅是第一个序列的子序列，两个 (AA 和 GA) 仅是第二个序列的子序列，四个 (AG, GG, TA, TG) 是两个序列的子序列，其余七个 (AC, AT, CC, CT, GC, GT 和 TT) 均不是任何一个序列的子序列。因此，所求的 2-相似度为 $4 + 7 = 11$。

样例输入 2

T
AG
3

样例输出 2

Discussions

About Discussions

The discussion section is only for posting: General Discussions (problem-solving strategies, alternative approaches), and Off-topic conversations.

This is NOT for reporting issues! If you want to report bugs or errors, please use the Issues section below.

Open Discussions 0

No discussions in this category.

Issues

About Issues

If you find any issues with the problem (statement, scoring, time/memory limits, test cases, etc.), you may submit an issue here. A problem moderator will review your issue.

Guidelines:

This is not a place to publish discussions, editorials, or requests to debug your code. Issues are only visible to you and problem moderators.
Do not submit duplicated issues.
Issues must be filed in English or Chinese only.

Active Issues 0

No issues in this category.

Closed/Resolved Issues 0