Datasets of grammatically uncommon sentences

I’m trying to train a model that replaces grammatically correct but uncommon sentences with their more common counterparts. I’m looking for any datasets of grammatically uncommon sentences paired with their more common versions.
For example:

  1. Already, enough punishment had been given.
  2. Enough punishment had been given already.
  3. Enough punishment had already been given.

All three sentences say the same thing, but No. 3 is the version you would most likely encounter in English.

Another example:

  1. Either by you or someone else, the bill must be paid.
  2. Either you or someone else must pay the bill.
  3. The bill must be paid either by you or someone else.

Here No. 1 is the unlikely version, while No. 2 and No. 3 are more likely.
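
To make the format concrete, here is a rough sketch of the kind of paired data I have in mind, built from the two examples above. The field names ("variants", "common", "source", "target") and the helper to_pairs are placeholders I made up, not taken from any existing dataset. Treating every variant not marked as common as an "uncommon" source is also just my assumption.

```python
import json

# Hypothetical record layout: each record groups variants of one sentence
# and marks which ones count as the "common" version (zero-based indices).
records = [
    {
        "variants": [
            "Already, enough punishment had been given.",
            "Enough punishment had been given already.",
            "Enough punishment had already been given.",
        ],
        "common": [2],  # assumption: the other two are treated as uncommon
    },
    {
        "variants": [
            "Either by you or someone else, the bill must be paid.",
            "Either you or someone else must pay the bill.",
            "The bill must be paid either by you or someone else.",
        ],
        "common": [1, 2],
    },
]


def to_pairs(record):
    """Expand one record into (uncommon, common) training pairs."""
    common = set(record["common"])
    variants = record["variants"]
    return [
        (src, variants[j])
        for i, src in enumerate(variants)
        if i not in common
        for j in sorted(common)
    ]


pairs = [pair for record in records for pair in to_pairs(record)]

# Emit one JSON object per line, a common layout for seq2seq training data.
for src, tgt in pairs:
    print(json.dumps({"source": src, "target": tgt}))
```

Each output line is one uncommon sentence paired with one common rewrite, so a sentence with two acceptable rewrites appears twice.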

Any suggestions on where I can find such datasets?


© Copyright 2013-2019 Analytics Vidhya