Snapshot rollback failed #418

Open
1563932024 opened this issue Sep 3, 2024 · 2 comments

Comments

1563932024 commented Sep 3, 2024

If a snapshot is created while a node is unexpectedly down, rolling back to that snapshot fails after the node has come back online.

[root@stor1 ~]# linstor n l
╭────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node  ┊ NodeType  ┊ Addresses               ┊ State                                        ┊
╞════════════════════════════════════════════════════════════════════════════════════════════╡
┊ stor1 ┊ SATELLITE ┊ 10.0.0.225:3366 (PLAIN) ┊ Online                                       ┊
┊ stor2 ┊ SATELLITE ┊ 10.0.0.170:3366 (PLAIN) ┊ Online                                       ┊
┊ stor3 ┊ SATELLITE ┊ 10.0.0.240:3366 (PLAIN) ┊ OFFLINE (Auto-eviction: 2024-09-03 17:38:12) ┊
╰────────────────────────────────────────────────────────────────────────────────────────────╯
To cancel automatic eviction please consider the corresponding DrbdOptions/AutoEvict* properties on controller and / or node level
See 'linstor controller set-property --help' or 'linstor node set-property --help' for more details
[root@stor1 ~]#
[root@stor1 ~]#
[root@stor1 ~]#
[root@stor1 ~]# linstor s c test1 snapshot2
WARNING:
    Snapshot for resource 'test1' will not be created on node 'stor3' because that node is currently offline.
SUCCESS:
Description:
    New snapshot 'snapshot2' of resource 'test1' registered.
Details:
    Snapshot 'snapshot2' of resource 'test1' UUID is: 93d33a54-c354-41b7-9d4d-f2b9611c5388
SUCCESS:
    (stor2) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    Suspended IO of '[test1]' on 'stor2' for snapshot
SUCCESS:
    (stor1) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    Suspended IO of '[test1]' on 'stor1' for snapshot
SUCCESS:
    (stor1) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    (stor1) Snapshot [ZFS-Thin] with name 'snapshot2' of resource 'test1', volume number 0 created.
SUCCESS:
    Took snapshot of '[test1]' on 'stor1'
SUCCESS:
    (stor2) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    (stor2) Snapshot [ZFS-Thin] with name 'snapshot2' of resource 'test1', volume number 0 created.
SUCCESS:
    Took snapshot of '[test1]' on 'stor2'
SUCCESS:
    (stor2) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    Resumed IO of '[test1]' on 'stor2' after snapshot
SUCCESS:
    (stor1) Resource 'test1' [DRBD] adjusted.
SUCCESS:
    Resumed IO of '[test1]' on 'stor1' after snapshot
[root@stor1 ~]# linstor s l
╭───────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ SnapshotName   ┊ NodeNames           ┊ Volumes  ┊ CreatedOn           ┊ State      ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ test1        ┊ snapshot2      ┊ stor1, stor2        ┊ 0: 5 GiB ┊ 2024-09-03 17:28:37 ┊ Successful ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────╯

At this point stor3 is back online, but the snapshot rollback fails.

[root@stor1 ~]# linstor r l
╭───────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node  ┊ Port ┊ Usage  ┊ Conns ┊    State ┊ CreatedOn           ┊
╞═══════════════════════════════════════════════════════════════════════════════╡
┊ test1        ┊ stor1 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2024-09-03 14:28:07 ┊
┊ test1        ┊ stor2 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2024-09-03 14:28:07 ┊
┊ test1        ┊ stor3 ┊ 7000 ┊ Unused ┊ Ok    ┊ UpToDate ┊ 2024-09-03 14:28:07 ┊
╰───────────────────────────────────────────────────────────────────────────────╯
[root@stor1 ~]# linstor s rb test1 snapshot2
ERROR:
Description:
    Snapshot 'snapshot2' of resource 'test1' on node 'stor3' not found.
Details:
    Resource: test1, Snapshot: snapshot2
Show reports:
    linstor error-reports show 66D6C9AC-00000-000000

ghernadi (Contributor) commented Sep 3, 2024

Hello,

Yes, there are some known limitations in the rollback implementation. We already have a few ideas for how this could be improved in the future.

For now, you can temporarily delete the resource from node stor3, run the rollback command, and then re-create the resource on stor3, which will then receive the (rolled-back) data from the other two nodes.
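
The three steps of this workaround could be sketched roughly as follows, using the resource, snapshot, and node names from the report above. This is only an illustration, not an official procedure: the storage pool name `pool1` is a hypothetical placeholder (the thread does not say which pool stor3 uses), and the `run` helper with its `DRY_RUN` switch is added here so that the script only prints the commands by default.

```shell
#!/usr/bin/env bash
# Sketch of the workaround: drop the replica that lacks the snapshot,
# roll back on the remaining nodes, then re-add the replica.
set -euo pipefail

RESOURCE="test1"        # from the report
SNAPSHOT="snapshot2"    # from the report
NODE="stor3"            # node that was offline when the snapshot was taken
STORAGE_POOL="${STORAGE_POOL:-pool1}"  # ASSUMPTION: hypothetical pool name
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rollback_without_node() {
    # 1. Temporarily remove the resource from the node without the snapshot.
    run linstor resource delete "$NODE" "$RESOURCE"
    # 2. Roll back on the nodes that do hold the snapshot.
    run linstor snapshot rollback "$RESOURCE" "$SNAPSHOT"
    # 3. Re-create the replica; DRBD resyncs it from the other nodes.
    run linstor resource create "$NODE" "$RESOURCE" --storage-pool "$STORAGE_POOL"
}

rollback_without_node
```

Run it once as-is to review the printed commands, then set `DRY_RUN=0` and the correct `STORAGE_POOL` to execute them.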

Alternatively, instead of rollback you could also restore the given snapshot into a new resource, but this approach might not fit your use-case.
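
Restoring into a new resource might look something like the sketch below. The target name `test1_restored` is a hypothetical placeholder, and the `run` helper with its `DRY_RUN` switch is again only an illustration so that nothing is executed by default. The assumption here is that the new resource ends up on the nodes that hold the snapshot (stor1 and stor2 in the report).

```shell
#!/usr/bin/env bash
# Sketch: restore a snapshot into a new resource instead of rolling back.
set -euo pipefail

SRC_RESOURCE="test1"    # from the report
SNAPSHOT="snapshot2"    # from the report
NEW_RESOURCE="${NEW_RESOURCE:-test1_restored}"  # ASSUMPTION: hypothetical name
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_into_new_resource() {
    # 1. Create an empty resource definition for the restore target.
    run linstor resource-definition create "$NEW_RESOURCE"
    # 2. Copy the volume definitions from the snapshot.
    run linstor snapshot volume-definition restore \
        --from-resource "$SRC_RESOURCE" --from-snapshot "$SNAPSHOT" \
        --to-resource "$NEW_RESOURCE"
    # 3. Create the new resource from the snapshot data.
    run linstor snapshot resource restore \
        --from-resource "$SRC_RESOURCE" --from-snapshot "$SNAPSHOT" \
        --to-resource "$NEW_RESOURCE"
}

restore_into_new_resource
```

The original `test1` resource is left untouched by this approach, so anything that uses the volume would have to be pointed at the new resource afterwards.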

1563932024 (Author) commented

Thank you for your reply.

I'm very interested in Linstor. Could you please share the approach and plan for addressing this issue (approximately when it will be fixed)?
