Skip to content

Conversation

@zmoazeni
Copy link
Contributor

@zmoazeni zmoazeni commented Sep 6, 2017

Description

@shlomi-noach this PR is to extend gh-ost to allow it to connect to the master via the extra port per #482

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code (I believe so, please let me know if I goofed up here)
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

We're going to test out this modified binary this week/next, so I can report back production behavior if you're interested.

I did setup my stuff locally and verified ./test.sh works and ./localtests/test.sh mostly works. Everything except the last test enum-pk passes, but that test fails on master too, so this isn't the cause. Edit: that must just be an inconsistent test. Re-running both ./test.sh works and ./localtests/test.sh passes 100% now.


Both Percona and Maria allow MySQL to be configured to listen on an extra port when their thread pool is enabled.

This is valuable because if the table has a lot of traffic (read or write load), gh-ost can end up starving the thread pool as incoming connections are immediately locked.

By using gh-ost on the extra port, MySQL locking will still behave the same, but MySQL will keep a dedicated thread for each gh-ost connection.

When doing this, it's important to inspect the extra-max-connections variable. Both Percona and Maria default to 1, so gh-ost may easily exceed with its threads.

An example local run using this

$ mysql -S /tmp/mysql_sandbox20393.sock -e "select @@global.port, @@global.extra_port"
+---------------+---------------------+
| @@global.port | @@global.extra_port |
+---------------+---------------------+
|         20393 |               30393 |
+---------------+---------------------+

./bin/gh-ost \
--initially-drop-ghost-table \
--initially-drop-old-table \
--assume-rbr \
--port="20395" \
--assume-master-host="127.0.0.1:30393" \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--max-lag-millis=1500 \
--user="gh-ost" \
--password="gh-ost" \
--database="test" \
--table="mytable" \
--verbose \
--alter="ADD mynewcol decimal(11,2) DEFAULT 0.0 NOT NULL" \
--exact-rowcount \
--concurrent-rowcount \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

@zmoazeni zmoazeni force-pushed the enable-extra-port branch 2 times, most recently from 0e222d7 to e42cefc Compare September 8, 2017 11:23
@zmoazeni
Copy link
Contributor Author

zmoazeni commented Sep 8, 2017

We did a run against a lesser read table this morning against the extra port and this branch. Worked great.

if err := this.db.QueryRow(query).Scan(&port, &this.migrationContext.InspectorMySQLVersion); err != nil {
var extraPort int
var versionComment string
if err := this.db.QueryRow(query).Scan(&port, &this.migrationContext.InspectorMySQLVersion, &versionComment); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit could be DRY'd up, but I wanted to get something in front of ya'll first. I can iterate to clean up some of this if you all want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please :)

}
}
if port != this.connectionConfig.Key.Port && extraPort != this.connectionConfig.Key.Port {
return fmt.Errorf("Unexpected database port reported: %+v", port)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if this error message should include the extra port (if it's available). Not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Do let's return log.Errorf("problem reading @@global.extra_port: %+v", err)

}
query := `select @@global.port, @@global.version`

extraPortRegexp := regexp.MustCompile("Percona|Maria")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized yesterday that Enterprise MySQL offers a thread pool, and presumably an extra port as well. I don't have access to a running version of that though (only Community). But this could be extended to include that as well if someone does have access or knows a valid way to differentiate.

Copy link
Contributor

@shlomi-noach shlomi-noach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this!

I'm first wondering whether -assume-master-host=my.master.host.com:30393 would solve the problem in the first place. Con: you'd need to specify master host name (not auto-detected) ; Pro: you specify the port explicitly. Or... I guess the calidateConnection() function then fails, because port != this.connectionConfig.Key.Port. I'm thinking aloud here.

I'm suggesting a small safety net. I don't like that we test for Percona|Maria ; I'm thinking we should instead go brute force:

Attempt to SELECT @@global.extra_port AS gh_ost_attempt_read_extra_port. If error, then we silently keep extraPort == 0 and do not exit with error. If successful, our win.

I'm OK for this to cause an error in MySQL ; this happens once per migration. If it gets logged, then we have the gh_ost_attempt_read_extra_port hint and everyone should be comfortable.

What do you think?

if err := db.QueryRow(extraPortQuery).Scan(&extraPort); err != nil {
return err
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all Percona/MariaDB servers guaranteed to have this variable? If so, since which version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do some digging. I believe Percona ported Maria's code and Maria has had it a while, but I'll try and get some concrete dates/version numbers.

Regardless, I do like your thoughts on brute forcing it. Then we don't even have to check for Percona or Maria (or the early versions without the functionality).

I'll update once I get more info.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this was introduced in MariaDB 5.5.21 on 16 Mar 2012:

I found this bit interesting:

Significantly more efficient thread pool, comparable in functionality to the closed source feature in MySQL Enterprise.

So I assume it MySQL enterprise has the same extra_port semantics. I don't have access to it, so I don't know. I also don't see any extra port information in their public documentation https://dev.mysql.com/doc/refman/5.5/en/thread-pool.html


For Percona it was introduced at the same time with version 5.5.29-30.0 on 26 Feb 2012 as beta functionality:

But has persisted in 5.6 and 5.7 as a stable feature.

Percona explicitly mentions that it was ported from Maria's code:

Ported the Thread Pool patch from MariaDB. This feature enables the server to keep the top performance even with the increased number of client connections.

I found the timeline interesting (Percona's release/announcement before Maria). But I assume some folks on Percona's side were keeping up to date on the Maria alpha/pre-merged improvements and brought it over to Percona early.


Completely off topic, but has GitHub evaluated using the Thread Pool for their own workload? I'm just assuming you all don't since you haven't run into this problem. I also imagine you all deal with MANY more connections than we do (total connections that is. you may have better partitioning so per-server connections are around the same).

I'm genuinely curious if you all evaluated it and decided against it for one reason or another. Your experience might influence us to switch back to one-connection-per-thread. You also might be running MySQL Community so this isn't even an option.

If you prefer not to answer that publicly, but still want to share your reasoning, by all means you can also shoot me an email at [email protected]

I have been meaning to write up a blog post but we noticed issues in our upgrade from Percona 5.6 to Percona 5.7. I worked with Percona's team but it's really hard to narrow down what exactly changed behind the scenes in 5.7. My assumption is that something that was blocking concurrency internally in 5.6 was unblocked in 5.7 and the more threads began running concurrently causing us load issues. Particularly on deploys when we cycle thousands of application connections (close down old ones for dying unicorns and spin up new ones for new unicorns).

We could have spent more time on the connection cycling (more sleep/pause between a unicorn dying and a unicorn taking its place) but moving to the thread pool was a decent solution and so far it has worked well for us. Rails 5's schema caching has also helped prevent a thundering herd of queries as our Rails instances boot up, which helped reduce the query load from connections as they startup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely off topic, but has GitHub evaluated using the Thread Pool for their own workload? I'm just assuming you all don't since you haven't run into this problem. I also imagine you all deal with MANY more connections than we do (total connections that is. you may have better partitioning so per-server connections are around the same).

I'm genuinely curious if you all evaluated it and decided against it for one reason or another. Your experience might influence us to switch back to one-connection-per-thread. You also might be running MySQL Community so this isn't even an option.

Nothing to hide. No, we haven't, and the reason is just backlog. There's a lot of MySQL features we just haven't had time to poke yet. Sometime...

}
query := `select @@global.port, @@global.version`

extraPortRegexp := regexp.MustCompile("Percona|Maria")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition for extraPortRegexp repeats itself; can we make a constant in context.go, at least for the pattern?

if err := this.db.QueryRow(query).Scan(&port, &this.migrationContext.InspectorMySQLVersion); err != nil {
var extraPort int
var versionComment string
if err := this.db.QueryRow(query).Scan(&port, &this.migrationContext.InspectorMySQLVersion, &versionComment); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please :)

}
}
if port != this.connectionConfig.Key.Port && extraPort != this.connectionConfig.Key.Port {
return fmt.Errorf("Unexpected database port reported: %+v", port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Do let's return log.Errorf("problem reading @@global.extra_port: %+v", err)

@zmoazeni
Copy link
Contributor Author

I'm first wondering whether -assume-master-host=my.master.host.com:30393 would solve the problem in the first place. Con: you'd need to specify master host name (not auto-detected) ; Pro: you specify the port explicitly. Or... I guess the calidateConnection() function then fails, because port != this.connectionConfig.Key.Port. I'm thinking aloud here.

Yeah I used -assume-master-host=my.master.host.com:30393 and that bit works swell. I don't expect gh-ost to just know I want it to connect on the extra port. But these validateConnection() functions fail hard because they're trying to protect the user. That's understandable.

One other option I thought we could is instead of querying to see if the extra port is enabled and if it matches the args passed, we could offer a new switch to just ignore validating the connection. That's a bit dangerous but it is lower tech (e.g. easier to maintain) and it would work. Let me know if you prefer that option instead.

I'm suggesting a small safety net. I don't like that we test for Percona|Maria ; I'm thinking we should instead go brute force:

Attempt to SELECT @@global.extra_port AS gh_ost_attempt_read_extra_port. If error, then we silently keep extraPort == 0 and do not exit with error. If successful, our win.

I have zero objection to this. I'll make it so.

@shlomi-noach
Copy link
Contributor

I'm also wondering whether gh-ost should issue a warning if it starts up and notices there's a thread pool; perhaps it should indicate that the user should try and use the extra_port, or actually just go ahead and use extra_port implicitly?

@zmoazeni
Copy link
Contributor Author

Hrm. I can't answer that one. I'm really not a thread pool expert.

One thing that needs to be set is the extra_port_max_connections though. That defaults to 1 and gh-ost quickly exhausts that limit. So I am wary about automatically choosing that.

I'm not so concerned about the warning however it could be annoying if someone knows what they're doing and can't turn it off.

I was thinking of putting together a doc all about using gh-ost + thread pool, so perhaps that can help educate.

I'm planning on spending time on these updates next week.

@shlomi-noach
Copy link
Contributor

One thing that needs to be set is the extra_port_max_connections though. That defaults to 1 and gh-ost quickly exhausts that limit. So I am wary about automatically choosing that.

Good point.

I'm not so concerned about the warning however it could be annoying if someone knows what they're doing and can't turn it off.

I'd say log.Warn("hey watch out....") is a responsible thing to do.

I was thinking of putting together a doc all about using gh-ost + thread pool, so perhaps that can help educate.
I'm planning on spending time on these updates next week.
Merge state

Thank you so much!

Both Percona and Maria allow MySQL to be configured to listen on an extra port when their thread pool is enable.

* https://www.percona.com/doc/percona-server/5.7/performance/threadpool.html
* https://mariadb.com/kb/en/the-mariadb-library/thread-pool-in-mariadb-51-53/

This is valuable because if the table has a lot of traffic (read or write load), gh-ost can end up starving the thread pool as incomming connections are immediately blocked.

By using gh-ost on the extra port, MySQL locking will still behave the same, but MySQL will keep a dedicated thread for each gh-ost connection.

When doing this, it's important to inspect the extra-max-connections variable. Both Percona and Maria default to 1, so gh-ost may easily exceed with its threads.

An example local run using this

```
$ mysql -S /tmp/mysql_sandbox20393.sock -e "select @@global.port, @@global.extra_port"
+---------------+---------------------+
| @@global.port | @@global.extra_port |
+---------------+---------------------+
|         20393 |               30393 |
+---------------+---------------------+

./bin/gh-ost \
--initially-drop-ghost-table \
--initially-drop-old-table \
--assume-rbr \
--port="20395" \
--assume-master-host="127.0.0.1:30393" \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--max-lag-millis=1500 \
--user="gh-ost" \
--password="gh-ost" \
--database="test" \
--table="mytable" \
--verbose \
--alter="ADD mynewcol decimal(11,2) DEFAULT 0.0 NOT NULL" \
--exact-rowcount \
--concurrent-rowcount \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute
```
Copy link
Contributor Author

@zmoazeni zmoazeni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shlomi-noach Alrighty man. I'm going to have to apologize on the outset, I'm not a strong Go hacker. So there are no doubt Go-isms that I'm missing/breaking. Just let me know, and I can update the code.

This 1) is rebased against master and 2) addresses your main concern: we brute force the check for the extra_port instead of seeing if it's maybe available.

// swallow this error. not all servers support extra_port
}

if connectionConfig.Key.Port == port || (extraPort > 0 && connectionConfig.Key.Port == extraPort) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworked this conditional into something that felt a bit better. This is easier for me to grok. But let me know if you see something else that I can change.

} else if extraPort == 0 {
return "", fmt.Errorf("Unexpected database port reported: %+v", port)
} else {
return "", fmt.Errorf("Unexpected database port reported: %+v / extra_port: %+v", port, extraPort)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to call out this error message. Let me know if you want me to tweak the format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine.

return err
}
if err := this.validateConnection(this.singletonDB); err != nil {
if _, err := base.ValidateConnection(this.singletonDB, this.connectionConfig); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exposed kind of a problem with the original code. We would overwrite the ApplierMySQLVersion with this second call. I almost wrote it that way, but I wasn't entirely sure about this.db vs this.singletonDB. So let know how you want me to handle it.

I chose the first connection this.db to set the version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into it.

log.Infof("connection validated on %+v", this.connectionConfig.Key)
return nil

version, err := base.ValidateConnection(this.db, this.connectionConfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to keep around this private validateConnection function since you added the password length check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll consolidate and add a password-check flag

Copy link
Contributor

@shlomi-noach shlomi-noach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! I will merge into a enable-extra-port branch upstream and make a couple more changes as per comments.

} else if extraPort == 0 {
return "", fmt.Errorf("Unexpected database port reported: %+v", port)
} else {
return "", fmt.Errorf("Unexpected database port reported: %+v / extra_port: %+v", port, extraPort)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine.

return err
}
if err := this.validateConnection(this.singletonDB); err != nil {
if _, err := base.ValidateConnection(this.singletonDB, this.connectionConfig); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into it.

log.Infof("connection validated on %+v", this.connectionConfig.Key)
return nil

version, err := base.ValidateConnection(this.db, this.connectionConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll consolidate and add a password-check flag

@shlomi-noach shlomi-noach changed the base branch from master to enable-extra-port October 2, 2017 11:26
@shlomi-noach shlomi-noach merged commit 9890e66 into github:enable-extra-port Oct 2, 2017
@shlomi-noach shlomi-noach mentioned this pull request Oct 2, 2017
@zmoazeni
Copy link
Contributor Author

zmoazeni commented Oct 2, 2017

Thank YOU! And the rest of the gh-ost contributors!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants