Skip to content

add \t filling option #46

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Apr 8, 2025
Merged

add \t filling option #46

merged 24 commits into from
Apr 8, 2025

Conversation

emar-kar
Copy link
Contributor

@emar-kar emar-kar commented Mar 30, 2025

This MR is the base to follow the behavior of GNU ls to fill spaces with \t. After some time and investigation, it looks like 'efficiency' of using \t in GNU is due to filling the emptiness between columns by one character on every iteration. But in rust version, optimizations led to filling the whole padding with spaces in just one call.

This change is required to fix several issues in uu_ls:

7526
6897

And at the same time do not break 5396, where it was explicitly requested to fill spaces with \t.

In addition, with this update, it will be possible to use -T option, see MR7616 for details.

@sylvestre sylvestre requested a review from Copilot March 30, 2025 10:59
Copilot

This comment was marked as outdated.

@emar-kar
Copy link
Contributor Author

There is still one more thing I'm working on. Playing with tabs and spaces I noticed that if the last line of a grid does not reach the end of the grid, the separator is still being printed. Since the number of columns is defined in Dimenstions.widths. Is it reasonable to fix the output to omit printing the separator in this case?

@tertsdiepraam
Copy link
Member

Is it reasonable to fix the output to omit printing the separator in this case?

I'd say that is reasonable yes!

Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! And thanks for adding tests and good comments too. I have some suggestions to improve this a bit more. Let me know what you think!

src/lib.rs Outdated
}

// Last in row checks only the predifined grid width.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Last in row checks only the predifined grid width.
// Last in row checks only the predefined grid width.

src/lib.rs Outdated
if !last_in_row {
if padding_size > 0 {
f.write_str(&padding[0..padding_size])?;
// In case if this entry was the last on the current line,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// In case if this entry was the last on the current line,
// In case this entry was the last on the current line,

return Ok(());
}

let (tab_size, separator) = match &self.options.filling {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether using Option<usize> (or maybe Option<NonZeroUsize>) might be nicer here, because 0 has a special meaning here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got you right, you want to do something like:

let (tab_size, separator) = match &self.options.filling {
    Filling::Spaces(n) => (None, " ".repeat(*n)),
    Filling::Text(s) => (None, s.clone()),
    Filling::Tabs { spaces, tab_size } => (Some(*tab_size), " ".repeat(*spaces)),
};

// ...

match tab_size {
    Some(tab_size) => {}, // Calculate tabs.
    None => {}, // Print spaces.
}

If that's what you mean, I'll disagree in this design. With this change I can do something stupid as pass Some(0) as a tab_size and instead of simply filling everything with spaces, it will still try to calculate the tabs size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed what I would suggest, but the 0 tab size indeed should not be possible, so it would be:

Filling::Tabs { spaces, tabsize } => (NonZeroUsize::new(*tab_size), " ".repeat(*spaces)),

It would functionally be very similar to what you have, but it would make it very clear that None is a very different case than positive numbers. Feel free to leave it like this though. It's probably clear enough.

See https://doc.rust-lang.org/stable/std/num/type.NonZeroUsize.html for more info on that type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the link, didn't think about that before. I tried this solution, but got an error in calculations of the tabs cannot subtract usize from NonZero<usize>. Even tho I like the suggestion, but is it reasonable to add more types conversions to handle this update?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! You can just do a get after matching on the option to get a usize

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've read about it. But my question is mostly how justified is it, to add new type, just to replace if-else on this:

match tab_size {
    Some(tab_size) => {
        let tab_size = tab_size.get();
        // ...
        }
    None => {
        // ...
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could write that with an if let if you prefer:

if let Some(tab_size) = tab_size {

} else {

}

But I do see your point. It's up to you!

src/lib.rs Outdated
f.write_str(&padding[0..padding_size])?;
// In case if this entry was the last on the current line,
// there is no need to check for separator or padding.
if !last_in_row && num + next < self.cells.len() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to flip this condition and make it an early continue instead.

src/lib.rs Outdated
if padding_size > 0 {
f.write_str(&padding[0..padding_size])?;
// In case if this entry was the last on the current line,
// there is no need to check for separator or padding.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean "to print the separator and padding"?

src/lib.rs Outdated
f.write_str(&separator)?;
} else {
// Move cursor to the end of the current contents.
cursor += width;
Copy link
Member

@tertsdiepraam tertsdiepraam Apr 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna try to simplify the maths here a little bit, because I find the condition of the while loop a bit hard to understand to be honest.

I'm trying to take advantage of the fact that only the first tab will be misaligned. I haven't tested it, but hopefully it can inspire you to simplify the code here a bit.

// Move cursor to the end of the current contents
cursor += width;

// Total number of spaces we need to print
let total_spaces = padding_size + DEFAULT_SEPARATOR_SIZE;

// Size of the first tab relative to the cursor.
let first_tab = tab_size - (cursor % tab_size);

// We only print tabs if at least the first tab fits within the spaces we print
if first_tab >= total_spaces {
    let rest_spaces = total_spaces - first_tab;
    let tabs = 1 + (rest_spaces / tab_size);
    let spaces = rest_spaces % tab_size;
    todo!("print that number of tabs and spaces");
} else {
    todo!("print only spaces");
}

cursor += total_spaces;

- [`filling`][filling]: what to put in between two columns — either a number of
spaces, or a text string;
- [`filling`][filling]: how to fill empty space between columns:
- [`Filling::Spaces`][Spaces] number of spaces between columns;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make the tabs option a part of the spaces option maybe? What do you think? Something like

enum Filling {
    Whitespace {
        spaces: usize,
        tab_size: Option<usize>,
    }
    Text(/* ... */),
}

What you got works pretty well too, but maybe it would be nice to also configure the separator size for a tab layout. I'd accept this too though!

Copy link
Contributor Author

@emar-kar emar-kar Apr 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying not to break the API, but introduce a new thing. Maybe we can add separator size in new Tab instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I like your suggestion!

Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great progress here! This looks almost done, just some small suggestions. I'll leave the choice for NonZeroUsize to you. Don't feel obliged to implement it.

You can ping me when you feel it's ready.

@@ -13,6 +13,12 @@
use ansi_width::ansi_width;
use std::fmt;

/// Number of spaces in one \t.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor improvement is to put the character between backticks to make it nicer in rustdoc: `\t`

This only goes for the doc comments of course.

return Ok(());
}

let (tab_size, separator) = match &self.options.filling {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could write that with an if let if you prefer:

if let Some(tab_size) = tab_size {

} else {

}

But I do see your point. It's up to you!

src/lib.rs Outdated
let num = match self.options.direction {
Direction::LeftToRight => y * self.dimensions.widths.len() + x,
Direction::TopToBottom => y + self.dimensions.num_lines * x,
let (num, next) = match self.options.direction {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So num here was already not very descriptive (which wasn't your fault obviously), but with next it even more unclear. Is it possible to make these names a bit more descriptive? Currently, it's not super clear that num is an absolute value, but next is an offset.

Maybe some of the explanation above last_in_row could go here.

@emar-kar
Copy link
Contributor Author

emar-kar commented Apr 8, 2025

@tertsdiepraam I've updated variables naming and add some clarifications in the comments. Can you have a look please?

Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Let's see it in action in ls!

@tertsdiepraam tertsdiepraam merged commit c8db34b into uutils:main Apr 8, 2025
7 checks passed
Copy link

codecov bot commented Apr 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (de183c9) to head (12105f9).
Report is 6 commits behind head on main.

Additional details and impacted files
@@    Coverage Diff     @@
##   main   #46   +/-   ##
==========================
==========================

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants