Press "Enter" to skip to content

Curated SQL Posts

Time Series Feature Engineering in Pandas

Matthew Mayo knows that time is a flat circle:

Feature engineering is one of the most important steps when it comes to building effective machine learning models, and this is no less important when dealing with time-series data. By being able to create meaningful features from temporal data, you can unlock predictive power that is unavailable when applied to raw timestamps alone.

Fortunately for us all, Pandas offers a powerful and flexible set of operations for manipulating and creating time-series features.

Click through for seven things you can do in Pandas to extend or work with time series data.

Leave a Comment

Tips for Troubleshooting PostgreSQL Performance Slowdowns

Umair Shahid shares a few tips with us:

If you are a technology leader overseeing a team of developers who manage PostgreSQL as part of a broader application stack, or you are responsible for uptime and customer satisfaction at scale, knowing where to look first can make all the difference.

Let us walk through a focused checklist of patterns and places that commonly hold the key to unlocking better PostgreSQL performance.

This is a very high-level set of reminders regarding where you should look, rather than a detailed troubleshooting guide. But sometimes, it’s good to have that reminder.

Leave a Comment

Digging Further into RTABench Q0 Optimization on PostgreSQL

Andrei Lepikhov responds to feedback:

In the previous post, I explored some nuances of Postgres related to indexes and parallel workers. This text sparked a lively discussion on LinkedIn, during which one commentator (thanks to Ants Aasma) proposed an index that was significantly more efficient than those discussed in the article. However, an automated comparison of EXPLAINs did not clarify the reasons for its superiority, necessitating further investigation.

Click through for the index and what Andrei learned along the way.

Leave a Comment

Tips for Working with Real-Time Analytics in Microsoft Fabric

Reitse Eskens shares some tips:

When discussing options, possibilities, and solutions with customers, the Real-Time stack began to emerge. We received questions on ingestion that couldn’t be simply answered using batch processing. The best part is that we can start learning new technology!

The following blog will outline the best things I learned working with real-time analytics.

Click through for those items.

Leave a Comment

Moving Items around in Power BI

Reza Rad sends this item to the back of the line:

When you have multiple items overlapping, you often need the feature to bring one forward or move it backward. In Power BI, this feature isn’t available by right-click. Instead, there is a Selection pane where you can easily set the order of elements. The selection pane also has other benefits. In this article and video, you will learn how to use the Selection pane to build the right order for your visuals.

Read on to see how it works.

Leave a Comment

Choosing the Right Logging Level in SSIS

Andy Brownsword does a bit of logging:

When creating SQL Agent jobs to execute SSIS packages we can choose the level of logging to be captured. Different settings are more beneficial under the right circumstances so it’s important to understand the differences to make the right decision.

These settings control the internal logging done by SSIS. This is out of the box and freely available, so why not use it effectively.

The real trick is that if you swallow all of the exceptions and errors, everybody will just assume your code is working perfectly and boom, problem solved. Or you could read Andy’s post and get actual information. Whatever works for you, I suppose.

Leave a Comment

Making Leading Wildcard Searches Faster

Brent Ozar flips everything around:

99.9% of you are never gonna need this.

But let’s say you need to run queries with leading (not trailing) wildcards, like this search for all the different national versions of the Encabulator, each of which has different prefixes depending on which government it’s being sold to:

This is indeed a pretty uncommon scenario. I’m pretty sure I’ve only ever needed to do this once. Well, twice, but in one case I couldn’t actually use the REVERSE() function because the column was itself an awful non-deterministic function and this solution wouldn’t work.

Leave a Comment

SQL Server Regular Expressions with Multiple Matches

Louis Davidson has popped and therefore cannot stop:

The goal of this week’s entry is specifically to show how to see how multiple matches can be viewed using SQL Server’s RegEx, specifically to make the examples clearer (especially in the upcoming entries).

There are several functions that you can use where multiple matches are used as part of the output:

Click through for that list and several examples of relevant functions in action.

Leave a Comment

Sundry Causes of Slow Disk Performance

Kevin Hill thinks about I/O:

“SQL Server is slow.”

We’ve all heard it. But that doesn’t always mean SQL Server is the problem. And “slow” means nothing without context and ability to verify.

More often than you’d think, poor performance is rooted in the one thing most sysadmins don’t touch until it’s on fire: the disk subsystem.

There are other potential causes as well, such as choosing the wrong RAID array format (like, say, RAID 6 for your extremely busy log files) and limited bandwidth to a SAN. Note that Kevin’s listings for what constitutes acceptable disk focuses primarily on on-premises solutions, maybe biased toward direct attached storage versus a SAN. For cloud databases, spikes of 30-60 seconds are perfectly fine, of course.

Leave a Comment