Advent of Code: Day 4

Reading Time: 3 minutes

It’s day 4 of the Advent Of Code!

Yesterday wasn’t a good day for the set-based solution series, but today turned out to be MUCH better. In fact, I managed to finish both problems within 16 minutes of their release, and, I was one of the top 800 quickest respondents on each part of the question.

The Challenge

The challenge today was to; first split a string into a series of “words”, and then check whether or not there were duplicates in the string. A string with duplicates is viewed as invalid.

In part one, we had to count the number of valid strings in our input.

The solution (download)

As I mentioned on day 2, I have a utility function which is designed for splitting strings. So I used that function to split each string into a series of words.

DECLARE @xml as xml

SET @xml = cast(('<X>'+replace(@str,@delimiter ,'</X><X>')+'</X>') as xml)

SELECT N.value('.', 'varchar(10)') as value 
FROM @xml.nodes('X') as T(N)

After that, it was a simple grouping query, filtered by comparing the total number of words in a row count(words) with the distinct number of words in a row count(distinct words).

select a.RowID
from @t a
cross apply [Utility].[SplitListXML](a,' ')
group by a.RowID
having count(item) = count(distinct Item)

So, 3 minutes and 28 seconds after the challenge started, I had a solution for part one, and I was ready for part 2. I was also quite optimistic that it would fit the first part of the solution better than yesterday.

And I was right to be optimistic.

The second part of the challenge required that words which were anagrams of each other also “invalidate” the entire string.

I already had the words split out, now I just had to find a way to compare anagrams… So I used my tally table and the substring function to split the words into individual letters, similarly to the technique I used on day 1.

select a.RowID as RowNumber,
       b.RowID as WordNumber,
       t.Number as LIndex,
       substring(b.Item,Number,1) as Item
from @t a
cross apply Utility.Utility.SplitListXML(a,' ') b
inner join Utility.dbo.TallyTable t
on Number between 1 and len(b.Item)

This gives me an individual table of the letters in the words. To check for anagrams, I decided to reform each word but with the letters in alphabetical order.

Sorting the letters is easy. Reforming the words… that took me a minute or two to figure out.

And then I remembered. The technique of using FOR XML PATH to split a string works just as well to CONCATENATE a string.

You can actually play around with adding delimiters and all sorts (see this blog post), but for this question, we just had to make a concatenate a bunch of letters with no frills.

SELECT o.RowNumber,
       o.WordNumber,
       (
           SELECT Item + '' AS [text()]
           FROM @tmp [i]
           where [i].RowNumber = [o].[RowNumber]
           and [i].WordNumber = [o].[WordNumber]
           order by Item
           FOR XML PATH('')
        ) as newword
from @tmp [o]
group by o.RowNumber,
         o.WordNumber

The important part of this solution is the order by clause inside the subquery. This ensures that our output is in beautifully alphabetised .

Once I have a table of newly reformed words, I can use the exact same query as in part one, and that’s it, another day done!

Once again, thanks to @ericwastl for the concept and all the work that goes into the challenges. Really enjoying it!

One comment

Leave a Reply

Your email address will not be published. Required fields are marked *