@pohly pohly commented Dec 10, 2025

What type of PR is this?

/kind bug
/kind failing-test

What this PR does / why we need it:

The kubekins image was updated from containerd 1.7 to 2.2, which broke local-up-cluster.sh in CI because more recent containerd uses single quotation marks around strings instead of double quotation marks. The sed search/replace in the script therefore no longer matched, so containerd failed to mount overlayfs on the default /var/lib/containerd. In CI, containerd has to use the emptyDir host mount under /docker-graph instead.

The fix is to relax the search term slightly so that it accepts both kinds of quotation marks.
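
The sed expression itself is not quoted in this description, so the following is only a minimal sketch of the idea, not the actual change. It assumes the script rewrites a `root = ...` line in the containerd config; the helper name `relocate_containerd_root` and the target path `/docker-graph/containerd/daemon` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not the code from this PR: point containerd's `root`
# setting at a new directory, whichever quote style the existing value uses.
relocate_containerd_root() {
  local config="$1" new_root="$2" tmp
  tmp="$(mktemp)"
  # The bracket expression ['"] matches one single or one double quote.
  # \1 keeps the captured `root = ` prefix; the new value is written with
  # double quotes, which TOML accepts as a basic string.
  sed -e "s;^\(root *= *\)['\"]/var/lib/containerd['\"];\1\"${new_root}\";" \
    "${config}" > "${tmp}"
  # Copy the result back instead of moving it, to keep the file's permissions.
  cat "${tmp}" > "${config}"
  rm -f "${tmp}"
}

# Demo: both quoting styles end up with the same rewritten line.
for line in 'root = "/var/lib/containerd"' "root = '/var/lib/containerd'"; do
  demo="$(mktemp)"
  printf '%s\n' "${line}" > "${demo}"
  relocate_containerd_root "${demo}" /docker-graph/containerd/daemon
  cat "${demo}"   # prints: root = "/docker-graph/containerd/daemon"
  rm -f "${demo}"
done
```

The sketch avoids `sed -i` so that it behaves the same with GNU and BSD sed. It also assumes the new path contains neither `;` nor `&`, which would be interpreted by sed.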

Which issue(s) this PR is related to:

N/A

https://testgrid.k8s.io/conformance-all#local-up-cluster,%20master%20(dev)
https://testgrid.k8s.io/sig-node-dynamic-resource-allocation#ci-dra-integration
https://testgrid.k8s.io/sig-node-dynamic-resource-allocation#ci-dra-integration-1-34
https://testgrid.k8s.io/sig-node-dynamic-resource-allocation#ci-dra-integration-1-35

Special notes for your reviewer:

See also Slack:

This needs to be backported to 1.34 and 1.35 if it doesn't make it into 1.35.0.

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. labels Dec 10, 2025
@k8s-ci-robot
Contributor

Please note that we're already in Test Freeze for the release-1.35 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.35.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Dec 10 03:39:32 UTC 2025.

@k8s-ci-robot k8s-ci-robot added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 10, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 10, 2025
@pohly
Contributor Author

pohly commented Dec 10, 2025

/sig testing

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 10, 2025
@pohly
Contributor Author

pohly commented Dec 10, 2025

/test pull-kubernetes-e2e-gce

Some NFS failures (known issue, if I am not mistaken).

/skip

Cluster creation worked again: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/135692/pull-kubernetes-local-e2e/1998669135306821632. However, there are some known test failures that prevent a fully successful run.

/assign @dims @BenTheElder @upodroid

I suggest we wait for code thaw, give this some soak time in master, then ask for a backport to release-1.34 and release-1.35.

@dims
Member

dims commented Dec 10, 2025

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 10, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 8c18656666c6c07f81f4f6847ee72c76cdc29c9a

Member

@BenTheElder BenTheElder left a comment

/lgtm
/approve
thanks!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, dims, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-triage-robot

Retesting failed PR that otherwise appears ready for merge.

Please help us fix flaky tests by following our Flaky Tests Guide.

Prevent this bot from retesting with /lgtm cancel or /hold.
For this robot's configuration, see here.

/retest-required

@BenTheElder
Member

> I suggest we wait for code thaw, give this some soak time in master, then ask for a backport to release-1.34 and release-1.35.

I don't think this is a risk for code freeze but I do think it's less than ideal to have one of our test signals remain broken.

This change shouldn't affect end-users, but it does affect our CI and developers, and the change is small.

If the job weren't still failing for other reasons, I'd recommend landing it for sure. As it is, I guess I can see waiting.

Are those just flakes, or fundamentally broken tests on local-up-cluster.sh e2e?
/test pull-kubernetes-local-e2e

@k8s-ci-robot
Contributor

@pohly: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-local-e2e | f58f81d | link | false | `/test pull-kubernetes-local-e2e` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@BenTheElder
Member

@pohly
Contributor Author

pohly commented Dec 11, 2025

> I don't think this is a risk for code freeze but I do think it's less than ideal to have one of our test signals remain broken.

So you are suggesting we should ask SIG Release to include this in 1.35? It wouldn't remain broken for long (one more week) if not included.

@BenTheElder
Member

> So you are suggesting we should ask SIG Release to include this in 1.35? It wouldn't remain broken for long (one more week) if not included.

Yeah, at this point I could go either way, but I think it's harmless to the release (fully contained to a developer script that is not used directly for releasing and not used in production), and not fixing it makes some of our CI useless in the interim. If we merge other changes in the meantime, we're missing signal on those, though at least not on anything critical.
