Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The best way to initialize an existing column? #13

Closed
radanskoric opened this issue Jun 16, 2024 · 16 comments · Fixed by #20
Closed

The best way to initialize an existing column? #13

radanskoric opened this issue Jun 16, 2024 · 16 comments · Fixed by #20
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@radanskoric
Copy link

What is the best way to initialize a positioning column when it's being added to a table that already has rows in it?

With existing data in a table we need to do the little dance of adding the column allowing null, then backfilling existing rows and then adding the not null constraint and unique index. But it's not clear how to do the backfill.

Since the interface assumes we don't fill the columns directly it suggests we should load each record in turn and update it to something like last position? But that would be pretty inefficient for any table with non trivial amount of rows in it.

Any recommendations here?

@brendon
Copy link
Owner

brendon commented Jun 16, 2024

Hi @radanskoric, that's a very good question and one I haven't dealt with explicitly yet. I hadn't thought about the constraints getting in the way. I wonder if it could be a case for a custom migration method that does this dance transparently? Something a bit like t.references. There are a few gems that have their own migration methods in a similar vein but I can't think of any at the moment.

Would you feel up to the task of looking at a PR for this?

@radanskoric
Copy link
Author

I can do that if you point me at the relevant parts of the codebase.

However, I'm wondering what the correct mechanism should be?

I tried the following:

class AddPositioningColumn < ActiveRecord::Migration[7.1]
  def change
    add_column :tickets, :position, :integer

    Ticket.find_each do |ticket|
      ticket.update! position: :last
    end

    change_column_null :tickets, :position, false
    add_index :tickets, :position, unique: true
  end
end

But that didn't work. All records ended up with position == 0 and then adding the unique index failed.

I traced the issue down to Positioning::Mechanisms#last_position:

def last_position
    (positioning_scope.maximum(@column) || 0) + (in_positioning_scope? ? 0 : 1)
end

The issue seems to be that the code assumes all persisted rows are set, it can't deal with partially initialised rows. in_positioning_scope? is always true since all records are saved and then it's always adding 0 so all rows get position 0.

But looking at the code I guess it would work just fine if I bypassed positioning methods and initialised all values to 0,1,2,3, ..., correct?

@brendon
Copy link
Owner

brendon commented Jun 18, 2024

Yes you're right, it's probably better to skip all the callbacks and validations and set the values directly. The callback functionality assumes an established list as you mention.

I wish I could remember where I saw the custom migration method in a gem. I can't think of it for the life of me :) Ah, just remembered: https://github.com/thoughtbot/paperclip?tab=readme-ov-file#migrations

An old library but probably still applies.

https://github.com/thoughtbot/paperclip/blob/c769382c9b7078f3d1620b50ec2a70e91ba62ec4/lib/paperclip/schema.rb#L19

If you feel up to this, give it a go. For sure the way Paperclip did it is probably well out of date but it's worth a shot. I wonder if ActiveRecord has a hook for adding these methods. https://github.com/rails/rails/blob/main/activerecord/lib/active_record/connection_adapters/abstract/schema_statements.rb

@radanskoric
Copy link
Author

That should definitely be doable, the tricky part is making it work well cross databases.

Most efficient and probably cleanest way is with a single update statement using a window function. But syntax differs slightly between databases. An example, in SQLite would be :

UPDATE tickets
SET position = new_position
FROM
  (SELECT id,
          rank() OVER sorted AS new_position
          FROM tickets 
          WINDOW sorted AS (
                 ORDER BY id
          )
   ) AS positioned
WHERE tickets.id = positioned.id;

The neat part about that is that the inner query (that's here ORDER BY id) could be also scoped to categories in case of a categorised positioned setup and it will again work as a single query. So it's clear how to do the custom back fill migration as one SQL command, most of the work is covering the various cases: 3 DBs, various positioned options.

I'll be honest, I would like to implement this but don't have time in the near future, for now I know how to unblock myself with a custom migration. I'm leaving this message here just so I don't forget when I, hopefully, get back to it. Or, if someone else comes along, maybe they find this useful.

@brendon
Copy link
Owner

brendon commented Jun 18, 2024

Thanks @radanskoric. I guess the downside of being efficient is that you need to implement it three times. The alternative would be to just fire off individual updates for each row. It would depend on what size tables we're taking about :D Still it's hopefully a one time thing.

Happy to leave this here. I might find time to have a play with it too :)

@brendon brendon added enhancement New feature or request help wanted Extra attention is needed labels Jun 18, 2024
@radanskoric
Copy link
Author

Thanks @radanskoric. I guess the downside of being efficient is that you need to implement it three times. The alternative would be to just fire off individual updates for each row. It would depend on what size tables we're taking about :D Still it's hopefully a one time thing.

Happy to leave this here. I might find time to have a play with it too :)

You make a good point that it's about speed. One way could be to implement the slow but trivially cross functional way of updating one row at a time and then just print a warning that this will be slow on a large database and to take care to implement a database specific backfill if needed.

@brendon
Copy link
Owner

brendon commented Jun 18, 2024

That's a good idea. Keeps things nice and simple :)

@qinmingyuan
Copy link
Contributor

qinmingyuan commented Jul 24, 2024

I have defined a active record class method to reset positions

  def RecordClass.reset_positions!
      positioning_columns.each do |_, columns|
        select(*columns).distinct.as_json(only: columns).each do |filter|
          where(filter).find_each.with_index do |item, index|
            item.update_columns position: index + 1
          end
        end
      end
  end

@radanskoric
Copy link
Author

@qinmingyuan nice, this is pretty much the one row at a time approach that we discussed above. 🙌 Unfortunately, the issue that lead me to writing here is no longer a priority for me and I haven't been able to set aside time to implement a PR.

Would you maybe like to take your implementation and make a PR for this issue here?

@qinmingyuan
Copy link
Contributor

Ok, I would like to do it on my weekend.

@brendon
Copy link
Owner

brendon commented Aug 21, 2024

@radanskoric and @qinmingyuan, I've committed a solution to main. Let me know how you find it :)

I'll release a new version in a minute.

@radanskoric
Copy link
Author

@radanskoric and @qinmingyuan, I've committed a solution to main. Let me know how you find it :)

I'll release a new version in a minute.

Nice, thank you both! :)

@brendon
Copy link
Owner

brendon commented Aug 21, 2024

You're most welcome @radanskoric. And in the weirdest coincidence, I just came across your article about Turbo 8 in my attempt to figure out why broadcasting page refreshes from the model side wasn't updating the page. Turns out it's to do with the request-id on the turbo-stream tag preventing the initiator of the request from honouring the refresh because it's assumed a full page render will be received anyway. It's super subtle and not really documented. Thanks for taking the time to publish your findings! :)

https://radanskoric.com/articles/turbo-morphing-deep-dive

@brendon
Copy link
Owner

brendon commented Aug 21, 2024

Out of interest. I noticed it's quite hard to render a refresh tag in the controller. I've resorted to:

format.turbo_stream { render turbo_stream: turbo_stream.action(:refresh, '') }

I'm wondering if I should do up a PR to add refresh as a method on turbo_stream (the tag builder). What do you do when you want to trigger a refresh for the update instigator as well as all the subscribers?

Happy to discuss this elsewhere if you'd rather :D

@radanskoric
Copy link
Author

Out of interest. I noticed it's quite hard to render a refresh tag in the controller. I've resorted to:

format.turbo_stream { render turbo_stream: turbo_stream.action(:refresh, '') }

I'm wondering if I should do up a PR to add refresh as a method on turbo_stream (the tag builder). What do you do when you want to trigger a refresh for the update instigator as well as all the subscribers?

Happy to discuss this elsewhere if you'd rather :D

I would normally suggest moving the discussion to not spam the project maintainer but that's you. 😂

However, in this case, the answer is very simple:

  1. I do exactly as you do.
  2. Yes, it absolutely should be a method on the tag builder.
  3. There's already an open PR for it: Add turbo_stream.refresh builder method hotwired/turbo-rails#595 . If you open it you'll see that, currently, the last comment is mine. :D

If you feel like it, probably best to just add your voice for getting this merged. :)

@brendon
Copy link
Owner

brendon commented Aug 21, 2024

Haha! Totally. I'll leave it at that and will chime in on the other PR :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants