Skip to content

fincore: add recursive directory scanning #3529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

teknoraver
Copy link
Contributor

@teknoraver teknoraver commented Apr 15, 2025

Add a --recursive flag to fincore which allows to recursively scan directories.

@karelzak
Copy link
Collaborator

How portable are the fts_*() functions? It will probably require some modifications to configure.ac and meson.build, along with some #ifdefs.

Would it be more portable to use nftw(), which is POSIX compatible? (Yes, with fts it seems more elegant, just one while()).

@teknoraver
Copy link
Contributor Author

the manpage says "These functions are available in Linux since glibc2"

@teknoraver
Copy link
Contributor Author

@karelzak I looked at nftw(). It's a cool API, but it has a big disadvantage, there isn't any way to pass an argument to the handler, and we need both fincore_control and fincore_state to pass.
Sure we could use a global variable, but it's not an approach I endorse usually.

@karelzak
Copy link
Collaborator

@karelzak I looked at nftw(). It's a cool API, but it has a big disadvantage, there isn't any way to pass an argument to the handler, and we need both fincore_control and fincore_state to pass. Sure we could use a global variable, but it's not an approach I endorse usually.

Using fincore_state as an argument for fincore_name() doesn't make sense since it's only used for the file name. It would be better to move fincore_state into the function and use just name as the argument. This would also reduce the use of switch() in main(). See the patch below.

Anyway, I agree that FTS looks better (no callbacks). We can try to use it since fincore is a Linux-only tool, but it will require checking for fts_openinconfigure.acandmeson.buildIt's also an opportunity to encourage the use of the FTS interface., adding #ifdef HAVE_FTS_OPENto the code, and printingerrx()if not supported and--recursive` is requested. It's also an opportunity to encourage the use of the FTS interface :-)

From fded72bd8269206ad4208e9a2372898a35070eea Mon Sep 17 00:00:00 2001
From: Karel Zak <kzak@redhat.com>
Date: Wed, 16 Apr 2025 09:40:46 +0200
Subject: [PATCH] fincore: fincore_state refactoring
 
---
 misc-utils/fincore.c | 34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)
 
diff --git a/misc-utils/fincore.c b/misc-utils/fincore.c
index a2f240052..7a4b5185b 100644
--- a/misc-utils/fincore.c
+++ b/misc-utils/fincore.c
@@ -356,12 +356,12 @@ static int fincore_fd (struct fincore_control *ctl,
 /*
  * Returns: <0 on error, 0 success, 1 ignore.
  */
-static int fincore_name(struct fincore_control *ctl,
-                       struct fincore_state *st)
+static int fincore_name(struct fincore_control *ctl, const char *name)
 {
        int fd;
        int rc = 0;
        struct stat sb;
+       struct fincore_state _st = { .name = name }, *st = &_st;
 
        if ((fd = open (st->name, O_RDONLY)) < 0) {
                warn(_("failed to open: %s"), st->name);
@@ -381,15 +381,15 @@ static int fincore_name(struct fincore_control *ctl,
                        warn(_("failed ioctl to get size: %s"), st->name);
        } else if (S_ISREG(sb.st_mode)) {
                st->file_size = sb.st_size;
-       } else {
-               rc = 1;                 /* ignore things like symlinks
-                                        * and directories*/
        }
 
        if (!rc)
                rc = fincore_fd(ctl, fd, st);
 
-       close (fd);
+       close(fd);
+
+       if (!rc)
+               rc = add_output_data(ctl, st);
        return rc;
 }
 
@@ -538,24 +538,14 @@ int main(int argc, char ** argv)
        }
 
        for(; optind < argc; optind++) {
-               struct fincore_state st = {
-                       .name = argv[optind],
-               };
-
-               switch (fincore_name(&ctl, &st)) {
-               case 0:
-                       add_output_data(&ctl, &st);
+               rc = fincore_name(&ctl, argv[optind]);
+               if (rc)
                        break;
-               case 1:
-                       break; /* ignore */
-               default:
-                       rc = EXIT_FAILURE;
-                       break;
-               }
        }
 
-       scols_print_table(ctl.tb);
-       scols_unref_table(ctl.tb);
+       if (rc == 0)
+               scols_print_table(ctl.tb);
 
-       return rc;
+       scols_unref_table(ctl.tb);
+       return rc ? EXIT_FAILURE :  EXIT_SUCCESS;
 }
-- 
2.49.0

@t-8ch
Copy link
Member

t-8ch commented Apr 16, 2025

Anyway, I agree that FTS looks better (no callbacks).

Unfortunately musl does not implement FTS.

@karelzak
Copy link
Collaborator

Anyway, I agree that FTS looks better (no callbacks).

Unfortunately musl does not implement FTS.

Yes, but if we all ignore FTS, it will never be adopted by other libcs. :-)

I have no strong opinion about it. Both ways (nftw with global ctl strcut; or FTS with errx on poor libcs) work for me.

@t-8ch
Copy link
Member

t-8ch commented Apr 16, 2025

Yes, but if we all ignore FTS, it will never be adopted by other libcs. :-)

For musl to adopt it I guess it would need to make it into POSIX first.
OTOH there are standalone implementations of FTS which can be used with musl.
If this is the util-linux policy, it's fine for me. I also have a WIP branch somewhere which got blocked on this.

@teknoraver
Copy link
Contributor Author

@karelzak I like it, if you merge it I will rebase and fix my PR

@teknoraver teknoraver force-pushed the fincore_mu branch 2 times, most recently from d36ba92 to 4b3976c Compare April 18, 2025 16:08
@masatake
Copy link
Member

bash-completion/fincore and misc-utils/fincore.1.adoc should be updated.

@teknoraver teknoraver force-pushed the fincore_mu branch 2 times, most recently from 87dc1a0 to bc197b5 Compare April 19, 2025 01:19
@teknoraver
Copy link
Contributor Author

@karelzak I had to refactor your change because it broke the test by changing the behavour in a substantial way.
On file error, fincore prints a warning, handles the other files, and then exit with error.
With that change it stopped on first unreadable or non existing file.

@teknoraver teknoraver marked this pull request as ready for review April 19, 2025 01:37
@teknoraver teknoraver force-pushed the fincore_mu branch 2 times, most recently from 3fe1614 to 00afda6 Compare April 22, 2025 11:28
Copy link
Member

@t-8ch t-8ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests would also be great.

@teknoraver teknoraver force-pushed the fincore_mu branch 2 times, most recently from 7161fa6 to 6b3c84c Compare April 22, 2025 23:38
@teknoraver
Copy link
Contributor Author

@t-8ch the problem with the recursive test is that the scan order is not consistent among runs, making the output diff unreliable

@teknoraver teknoraver force-pushed the fincore_mu branch 4 times, most recently from 23c6c0b to a5ed630 Compare April 23, 2025 10:45
@karelzak
Copy link
Collaborator

It looks good (after the meson update). @t-8ch, is there any objection?

@t-8ch
Copy link
Member

t-8ch commented Apr 24, 2025

@teknoraver

the problem with the recursive test is that the scan order is not consistent

You can run the output through sort --key.

@t-8ch
Copy link
Member

t-8ch commented Apr 24, 2025

@karelzak

is there any objection?

I'd really like to see tests.
Also I am wondering why an explicit cli switch is necessary. Wouldn't it be fine to switch to recursive mode if a directory is passed automatically?

@teknoraver
Copy link
Contributor Author

@t-8ch the tests are there, but unfortunately an unrelated one (findmnt/listmount) started failing and the CI is red

@teknoraver
Copy link
Contributor Author

@karelzak I'm pretty sure that the test won't pass on a system with 16k or 64k page size, but I don't have such machine to test on

@karelzak
Copy link
Collaborator

Wouldn't it be fine to switch to recursive mode if a directory is passed automatically?

Ah, yes, that's a good point.

I thought about that too when @teknoraver sent the first version, but I forgot to discuss it.

The mincore() and cachestat() functions are restricted to mapped files (such as regular files or /dev/mem,
etc.), enabling them for directories by default should be safe.

@teknoraver
Copy link
Contributor Author

fts_open works with regular files as well.
We can drop -R and always use fts_read to scan argv[]

@t-8ch
Copy link
Member

t-8ch commented Apr 24, 2025

I'm pretty sure that the test won't pass on a system with 16k or 64k page size, but I don't have such machine to test on

@glaubitz You tend to have access to a lot of exotic machines. Could you run the tests on a 64k pagesize machine?

@glaubitz
Copy link
Contributor

glaubitz commented Apr 24, 2025

I'm pretty sure that the test won't pass on a system with 16k or 64k page size, but I don't have such machine to test on

@glaubitz You tend to have access to a lot of exotic machines. Could you run the tests on a 64k pagesize machine?

Yeah, I can boot a ppc64 machine into a kernel with 64k pages enabled. Debian provides kernel packages with 64k pagesize for both big- and little-endian ppc64.

Can you tell me what repo and branch to test so I don't have to dig through the message list?

@teknoraver
Copy link
Contributor Author

@glaubitz hi!

The repo is:
https://github.com/teknoraver/util-linux.git

and the branch fincore_mu

I've fixed the test blindly, let me know if the pass on a 64k page size machine

@glaubitz
Copy link
Contributor

@glaubitz hi!

The repo is: https://github.com/teknoraver/util-linux.git

and the branch fincore_mu

I've fixed the test blindly, let me know if the pass on a 64k page size machine

The test fincore/count fails with the following diff:

(sid_ppc64-dchroot)glaubitz@perotto:~/util-linux-64k$ cat ./tests/diff/fincore/count 
--- /home/glaubitz/util-linux-64k/tests/output/fincore/count.aggregate  2025-04-25 07:15:05.073991097 +0000
+++ /home/glaubitz/util-linux-64k/tests/output/fincore/count    2025-04-25 07:15:05.841998281 +0000
@@ -53,13 +53,13 @@
 return value: 1
 0     0 i_dir/EMPTY_FILE
 return value: 0
-1  65535 i_dir/PAGESIZE_-1__incore_
+1 65535 i_dir/PAGESIZE_-1__incore_
 return value: 0
-1  65536 i_dir/JUST_PAGESIZE_incore_
+1 65536 i_dir/JUST_PAGESIZE_incore_
 return value: 0
 [ RECURSIVE SCAN ]
-    0      0 i_dir/EMPTY_FILE
-    1  65535 i_dir/PAGESIZE_-1__incore_
-    1  65536 i_dir/JUST_PAGESIZE_incore_
-PAGES   SIZE FILE
+    0     0 i_dir/EMPTY_FILE
+    1 65535 i_dir/PAGESIZE_-1__incore_
+    1 65536 i_dir/JUST_PAGESIZE_incore_
+PAGES  SIZE FILE
 return value: 0
(sid_ppc64-dchroot)glaubitz@perotto:~/util-linux-64k$

@masatake
Copy link
Member

@teknoraver I guess --raw option may help you.

@teknoraver
Copy link
Contributor Author

@glaubitz so it's just a whitespace corruption.
I used --raw as suggested by @masatake, it should pass now

@glaubitz
Copy link
Contributor

@glaubitz so it's just a whitespace corruption. I used --raw as suggested by @masatake, it should pass now

There is still some diff left, unfortunately:

(sid_ppc64-dchroot)glaubitz@perotto:~/util-linux-64k$ cat ./tests/diff/fincore/count 
--- /home/glaubitz/util-linux-64k/tests/output/fincore/count.aggregate  2025-04-25 12:17:51.153632799 +0000
+++ /home/glaubitz/util-linux-64k/tests/output/fincore/count    2025-04-25 12:17:52.421647956 +0000
@@ -53,9 +53,9 @@
 return value: 1
 0     0 i_dir/EMPTY_FILE
 return value: 0
-1  65535 i_dir/PAGESIZE_-1__incore_
+1 65535 i_dir/PAGESIZE_-1__incore_
 return value: 0
-1  65536 i_dir/JUST_PAGESIZE_incore_
+1 65536 i_dir/JUST_PAGESIZE_incore_
 return value: 0
 [ RECURSIVE SCAN ]
 0 0 i_dir/EMPTY_FILE
(sid_ppc64-dchroot)glaubitz@perotto:~/util-linux-64k$

@masatake
Copy link
Member

How about appending sed?

ts_check_prog "sed"
...
... | sort | sed -e 's/ \{2,\}/ /g'

@teknoraver
Copy link
Contributor Author

@masatake I'd rather add --raw everywhere and make the test more stable.
I have an ARM Macbook here which is using 16k page table, and that will require more work and more padding calculation if we want to support it

@teknoraver
Copy link
Contributor Author

@glaubitz I removed all spaces from the expected files and used --raw.
It should work now.

@glaubitz
Copy link
Contributor

@glaubitz I removed all spaces from the expected files and used --raw. It should work now.

Yes, it works now:

---------------------------------------------------------------------
  All 340 tests PASSED
---------------------------------------------------------------------
``

@teknoraver
Copy link
Contributor Author

I have turned 16k pages on on my ARM machine, and I've extended the test to work with such size too:

teknoraver@fedorarm:~/src/util-linux$ uname -a
Linux fedorarm 6.14.2-401.asahi.fc42.aarch64+16k #1 SMP PREEMPT_DYNAMIC Wed Apr 16 12:06:15 UTC 2025 aarch64 GNU/Linux

teknoraver@fedorarm:~/src/util-linux$ ./test_sysinfo pagesize
16384

teknoraver@fedorarm:~/src/util-linux$ tests/run.sh --show-diff fincore/count

-------------------- util-linux regression tests --------------------

                    For development purpose only.
                 Don't execute on production system!

       kernel: 6.14.2-401.asahi.fc42.aarch64+16k

      options: --show-diff \
               --srcdir=/home/teknoraver/src/util-linux/tests/.. \
               --builddir=/home/teknoraver/src/util-linux/tests/..

      fincore: count file contents in core    ... OK

---------------------------------------------------------------------
  All 1 tests PASSED
---------------------------------------------------------------------

This is the default on Apple ARM machines running Linux

@t-8ch
Copy link
Member

t-8ch commented Apr 27, 2025

The test failure in the distcheck CI job looks genuine.

@teknoraver
Copy link
Contributor Author

I saw it but I can't understand:

/home/runner/work/util-linux/util-linux/util-linux-2.41/_build/sub/../../tests/ts/fincore/count: line 124: /home/runner/work/util-linux/util-linux/util-linux-2.41/_build/sub/tests/output/fincore/count.err.aggregate: Permission denied
I create output/fincore/count.err.aggregate as temporary file to compare the result, is this path read-only?
Other outputs are placed in output/fincore so it must be writable

@t-8ch
Copy link
Member

t-8ch commented Apr 28, 2025

Maybe replace cp "$TS_EXPECTED_ERR" "$AGG_ERR" with cat "$TS_EXPECTED_ERR" > "$AGG_ERR". But that is just a guess.

Add a --recursive flag to fincore which allows to recursively scan
directories.

Co-authored-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Matteo Croce <teknoraver@meta.com>
@teknoraver
Copy link
Contributor Author

Hmm qemu-user armv7 is the only one failing now.
That build supports fts_open(), but the output with --recursive is empty, and the return value is 0.

--- /home/runner/work/util-linux/util-linux/tests/output/fincore/count.aggregate	2025-04-28 22:41:38.297718592 +0000
+++ /home/runner/work/util-linux/util-linux/tests/output/fincore/count	2025-04-28 22:41:47.448702913 +0000
@@ -58,8 +58,4 @@
 1 4096 i_dir/JUST_PAGESIZE_incore_
 return value: 0
 [ RECURSIVE SCAN ]
-0 0 i_dir/EMPTY_FILE
-1 4095 i_dir/PAGESIZE_-1__incore_
-1 4096 i_dir/JUST_PAGESIZE_incore_
-PAGES SIZE FILE
 return value: 0

This is weird, how can I reproduce such test locally?

@masatake
Copy link
Member

masatake commented May 1, 2025

You don't have to insist on reproducing the issue locally.
It's painful, but can be debugged in a CI/CD environment; you can run any code from ts/fincore/count for debugging.

To shorten the debugging cycle, disable all CI/CD environments other than qemu-user armv7.
e.g. #3205

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants