Skip to content

Free everything at exit #8868

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 7, 2023
Merged

Conversation

HParker
Copy link
Contributor

@HParker HParker commented Nov 8, 2023

when the FREE_ON_SHUTDOWN environment variable is enabled, manually free memory at shutdown.

https://bugs.ruby-lang.org/issues/19993

without FREE_ON_SHUTDOWN:

valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes -- ./miniruby basictest/test.rb 
==573270== LEAK SUMMARY:
==573270==    definitely lost: 658,324 bytes in 5,387 blocks
==573270==    indirectly lost: 955,708 bytes in 11,957 blocks
==573270==      possibly lost: 2,071,096 bytes in 12 blocks
==573270==    still reachable: 161,881 bytes in 275 blocks
==573270==         suppressed: 0 bytes in 0 blocks
==573270== Reachable blocks (those to which a pointer was found) are not shown.
==573270== To see them, rerun with: --leak-check=full --show-leak-kinds=all

with FREE_ON_SHUTDOWN=1

FREE_ON_SHUTDOWN=1 valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes -- ./miniruby basictest/test.rb 
==10258== 120 bytes in 1 blocks are definitely lost in loss record 1 of 1
==10258==    at 0x484A993: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==10258==    by 0x233734: calloc1 (gc.c:1900)
==10258==    by 0x233734: objspace_xcalloc (gc.c:12709)
==10258==    by 0x233734: ruby_xcalloc_body (gc.c:12716)
==10258==    by 0x40538C: Init_BareVM (vm.c:4184)
==10258==    by 0x20B0C6: ruby_setup (eval.c:82)
==10258==    by 0x20B24C: ruby_init (eval.c:101)
==10258==    by 0x1595B3: rb_main (main.c:38)
==10258==    by 0x1595B3: main (main.c:58)
==10258== 
==10258== LEAK SUMMARY:
==10258==    definitely lost: 120 bytes in 1 blocks
==10258==    indirectly lost: 0 bytes in 0 blocks
==10258==      possibly lost: 0 bytes in 0 blocks
==10258==    still reachable: 0 bytes in 0 blocks
==10258==         suppressed: 0 bytes in 0 blocks

@mame
Copy link
Member

mame commented Nov 8, 2023

@HParker Please add static for internal function definitions.

  Checking leaked global symbols...leaked
    free_shared_fiber_pool
    free_global_enc_table
    free_warning
    free_encoded_insn_data
    free_loaded_features_index
    free_compat_allocator_table
    free_default_rand_key
    free_static_symid_str
    free_transcoder_table
    free_rb_global_tbl
    free_generic_iv_tbl_
    free_vm_opt_tables
    free_on_shutdown

@HParker HParker force-pushed the add-free-everything-flag branch from 15d6556 to 8435b9c Compare November 8, 2023 07:16
@HParker
Copy link
Contributor Author

HParker commented Nov 8, 2023

oh! I misunderstood the failure and added rb_ infront of the names 🤦 thank you! I'll switch to static instead.

@HParker
Copy link
Contributor Author

HParker commented Nov 8, 2023

Actually @mame, am I able to use static? I think the functions are declared in different files so C will complain if I use static right? maybe there is a way around this I don't know about?

@HParker HParker force-pushed the add-free-everything-flag branch from 8435b9c to 998ed24 Compare November 8, 2023 07:46
@mame
Copy link
Member

mame commented Nov 8, 2023

@HParker Sorry, if they need to refer across files, it is OK to add rb_ prefix.

@HParker HParker force-pushed the add-free-everything-flag branch from 998ed24 to 2db85c4 Compare November 8, 2023 08:02
internal/vm.h Outdated
void rb_free_global_enc_table(void);
// rb_free_loaded_builtin_table needs a week definition in case minibuiltin.c is not used
// this is required for the wasm build.
void rb_free_loaded_builtin_table() __attribute__((weak));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like __attribute__((weak)) is not supported in Windows, but removing this break the wasm build.

Maybe there is another builtin.c equivalent place I need to define this so that wasm has it and Windows is happy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's all I can think of. I haven't tried but maybe:

diff --git a/builtin.c b/builtin.c
index aef5b2c2d4..c0cd0f6e50 100644
--- a/builtin.c
+++ b/builtin.c
@@ -57,6 +57,12 @@ rb_load_with_builtin_functions(const char *feature_name, const struct rb_builtin

 #endif

+void
+rb_free_loaded_builtin_table(void)
+{
+    // do nothing
+}
+
 void
 Init_builtin(void)
 {
diff --git a/internal/vm.h b/internal/vm.h
index 059c106e44..94434048da 100644
--- a/internal/vm.h
+++ b/internal/vm.h
@@ -91,7 +91,7 @@ void rb_free_environ(void);
 void rb_free_global_enc_table(void);
 // rb_free_loaded_builtin_table needs a week definition in case minibuiltin.c is not used
 // this is required for the wasm build.
-void rb_free_loaded_builtin_table() __attribute__((weak));
+void rb_free_loaded_builtin_table();
 void rb_free_rb_global_tbl(void);
 void rb_free_shared_fiber_pool(void);
 void rb_free_static_symid_str(void);
diff --git a/mini_builtin.c b/mini_builtin.c
index 4926afd031..a0bb46c54c 100644
--- a/mini_builtin.c
+++ b/mini_builtin.c
@@ -10,13 +10,6 @@

 #ifndef INCLUDED_BY_BUILTIN_C
 static struct st_table *loaded_builtin_table;
-
-void
-rb_free_loaded_builtin_table(void)
-{
-    if (loaded_builtin_table)
-        st_free_table(loaded_builtin_table);
-}
 #endif

 rb_ast_t *rb_builtin_ast(const char *feature_name, VALUE *name_str);
diff --git a/miniinit.c b/miniinit.c
index 2a14a0d1c5..bb22d6d9f5 100644
--- a/miniinit.c
+++ b/miniinit.c
@@ -49,3 +49,10 @@ Init_enc(void)
 }

 #include "mini_builtin.c"
+
+void
+rb_free_loaded_builtin_table(void)
+{
+    if (loaded_builtin_table)
+        st_free_table(loaded_builtin_table);
+}

@ko1 @nobu Do you have a better idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mame It seems fine.
Just one thing, don't use pre-ANSI-style declaration.

@HParker HParker force-pushed the add-free-everything-flag branch 2 times, most recently from 7a3c312 to 11cee78 Compare November 8, 2023 16:21
vm.c Outdated
rb_objspace_free_objects(objspace);
rb_free_generic_iv_tbl_();
if (th) {
xfree(nt);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on Mac, this fails with,

pointer being freed was not allocated

but I am confused since it seems like,

vm->ractor.main_thread

is always ZALLOCed in Init_BareVM.

@HParker HParker force-pushed the add-free-everything-flag branch 2 times, most recently from 9c642a9 to 7b8d4ee Compare November 8, 2023 17:34
@HParker HParker force-pushed the add-free-everything-flag branch from 7b8d4ee to 6b8df8e Compare November 8, 2023 19:23
@HParker HParker force-pushed the add-free-everything-flag branch 5 times, most recently from 8c3eb0f to bbded12 Compare November 9, 2023 20:09
@HParker
Copy link
Contributor Author

HParker commented Nov 9, 2023

I tested this branch with a Rails app, and I get a double free from freeing environ,

==76982== Invalid free() / delete / delete[] / realloc()
==76982==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==76982==    by 0x4E3BD7B: tdestroy_recurse (tsearch.c:767)
==76982==    by 0x4E3BD7B: tdestroy_recurse (tsearch.c:766)
==76982==    by 0x4E3BD7B: tdestroy_recurse (tsearch.c:764)
==76982==    by 0x4E3BD7B: tdestroy_recurse (tsearch.c:764)
==76982==    by 0x4E3BD7B: tdestroy_recurse (tsearch.c:766)
==76982==    by 0x4E3BD7B: tdestroy (tsearch.c:780)
==76982==    by 0x4EB98B5: free_mem (in /usr/lib/x86_64-linux-gnu/libc-2.31.so)
==76982==    by 0x4EB9B41: __libc_freeres (in /usr/lib/x86_64-linux-gnu/libc-2.31.so)
==76982==    by 0x48331C6: _vgnU_freeres (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_core-amd64-linux.so)
==76982==    by 0x4D679C1: __run_exit_handlers (exit.c:132)
==76982==    by 0x4D67A5F: exit (exit.c:139)
==76982==    by 0x4D45089: (below main) (libc-start.c:342)
==76982==  Address 0x303435b0 is 0 bytes inside a block of size 36 free'd
==76982==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==76982==    by 0x462E3B: rb_free_environ (setproctitle.c:146)
==76982==    by 0x35AD16: ruby_vm_destruct (vm.c:2995)
==76982==    by 0x15AFDE: rb_ec_cleanup (eval.c:268)
==76982==    by 0x15B868: ruby_run_node (eval.c:328)
==76982==    by 0x156206: rb_main (main.c:39)
==76982==    by 0x156206: main (main.c:58)
==76982==  Block was alloc'd at
==76982==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==76982==    by 0x4D67368: __add_to_environ (setenv.c:215)
==76982==    by 0x484459F: setenv (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==76982==    by 0x19AE5F: ruby_setenv (hash.c:5134)
==76982==    by 0x19B19C: env_aset (hash.c:5321)
==76982==    by 0x19B19C: env_aset_m (hash.c:5302)
==76982==    by 0x33E5AA: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3504)
==76982==    by 0x351682: vm_sendish (vm_insnhelper.c:5582)
==76982==    by 0x351682: vm_exec_core (insns.def:822)
==76982==    by 0x34269D: rb_vm_exec (vm.c:2481)
==76982==    by 0x345F89: invoke_block (vm.c:1499)
==76982==    by 0x345F89: invoke_iseq_block_from_c (vm.c:1569)
==76982==    by 0x345F89: invoke_block_from_c_bh (vm.c:1587)
==76982==    by 0x34694F: vm_yield_with_cref (vm.c:1624)
==76982==    by 0x34694F: vm_yield (vm.c:1632)
==76982==    by 0x34694F: rb_yield_0 (vm_eval.c:1362)
==76982==    by 0x34694F: rb_yield (vm_eval.c:1378)
==76982==    by 0x46BBBB: rb_ary_each (array.c:2532)
==76982==    by 0x33E5AA: vm_call_cfunc_with_frame_ (vm_insnhelper.c:3504)

I am going to stop freeing environ which means it will show up as a leak, but it allows testing with less trivial Ruby scripts. I would love a better solution, but I am not sure how to determine when/if something in environ is "owned" by the vm.

@HParker HParker force-pushed the add-free-everything-flag branch from bbded12 to 3f3f4b3 Compare November 9, 2023 23:41
@Edwing123
Copy link
Contributor

I would've liked to see FREE_ON_SHUTDOWN being prefixed with RB_. So it's easy to see that is related to Ruby.

@Edwing123
Copy link
Contributor

I regret saying RB_, let's better use RUBY_. what do you think? @peterzhu2118.

@HParker HParker force-pushed the add-free-everything-flag branch from 3f3f4b3 to 83700be Compare November 13, 2023 04:40
@HParker
Copy link
Contributor Author

HParker commented Nov 13, 2023

I looked around the code and found more examples of `getenv("RUBY_*" so I chose that as the prefix. Happy to change it, but that seems like the most consistent name.

Thanks for the suggestion @Edwing123

st_free_table(vm->static_ext_inits);
st_free_table(vm->ensure_rollback_table);

ruby_xfree(vm->postponed_job_buffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working on another issue with the postponed job infrastructure, I realised there's something a little bit annoying about doing this... It's the most miniscule of corner cases, but I figure I'll bring it up.

If some other gem has registered a signal handler, it's possible a signal might arrive after this and before the process exits, the gem's signal signal handler could then try and call rb_register_postponed_job, which will be a use-after-free here.

Stackprof in particular tries to get this right, and almost does: it registers a ruby_vm_at_exit hook here: https://github.com/tmm1/stackprof/blob/91d12e77b3d3c5014870b4ee7c19eab24ad04b5b/ext/stackprof/stackprof.c#L903 to disable itself.

There are just a couple teeny tiny problems though:

The first problem is, that the free-on-shutdown stuff is happening before ruby_vm_run_at_exit_hooks. Easy fix, let's hoist that further up I suppose? Or push this stuff further down?

The second problem is what happens in the presence of non-ruby-managed threads. If there are other native threads in the process, a signal handler could be in-progress on that thread already when ruby_vm_run_at_exit_hooks is called, so setting ruby_vm_running = 0; wouldn't stop it from happening.

I actually can't think of anything at all we can do about that though. If there was a way to run something at exit after all other threads have been destroyed, we could do that, but there isn't such a thing. I guess well-behaved extensions need to make sure they pthread_join all their threads in a ruby_vm_at_exit handler?

when the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
@HParker HParker force-pushed the add-free-everything-flag branch from 83700be to 16acb7c Compare December 7, 2023 18:18
@HParker HParker changed the title Free everything at shutdown Free everything at exit Dec 7, 2023
@peterzhu2118 peterzhu2118 merged commit 6816e8e into ruby:master Dec 7, 2023
@mame
Copy link
Member

mame commented Dec 20, 2023

@HParker @peterzhu2118 @byroot Matz suggested RUBY_FREE_AT_EXIT instead of ON_EXIT because there is Kernel#at_exit. Can we fix it?

@HParker
Copy link
Contributor Author

HParker commented Dec 20, 2023

@mame oops! 🤦 sorry! I even named the PR correctly, but must have got mixed up on the name in the PR. I opened this to fix it: #9302

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants