efcore: Consider more efficient pattern for seed data that will never change
I am a bit perplexed.
My first migration contains seed data for the Countries table: 249 rows. This data appears in InitMigration.Up (Ok), in ModelSnapshot.BuildModel (Ok) and in InitMigration.BuildTargetModel (Ok?).
I added a second migration, and I see that the countries data is present again in that migration's BuildTargetModel. Does this mean that the model is recreated from scratch for every migration? What if my initial migration contained several thousand rows of seed data? Would the source code of each subsequent migration be megabytes in size?
Here’s the code for configuring Country:
sealed class CountryConfiguration : IEntityTypeConfiguration<Country>
{
    public void Configure(EntityTypeBuilder<Country> builder)
    {
        builder.ToTable("Country");
        builder.HasOne<WorldRegion>()
            .WithMany()
            .HasForeignKey(m => m.RegionId)
            .IsRequired(false)
            .OnDelete(DeleteBehavior.SetNull);
        builder.HasAlternateKey(m => m.Name);
        builder.HasAlternateKey(m => m.NumericCode);
        builder.HasAlternateKey(m => m.Alpha2Code);
        builder.HasAlternateKey(m => m.Alpha3Code);
        object[] countriesData = Geography.List.Countries().ToArray();
        builder.HasData(countriesData);
    }
}
And here is the code for getting countries seeding data:
namespace Geography
open System
open FSharp.Data
open System.Collections.Generic
module List =
    type Region = {Id: int; Name:string; NumericCode: string}
    type Country = {
        Id:int;
        RegionId: Nullable<int>;
        Name: string;
        NumericCode: string;
        Alpha2Code: string;
        Alpha3Code: string;
    }
    [<Literal>] let private ``countries csv file`` = "./countries.csv"
    type private CountriesProvider = CsvProvider<
        ``countries csv file``,
        Schema = "name,alpha-2,alpha-3,country-code (string),iso_3166-2,region,sub-region,intermediate-region,region-code (string),sub-region-code,intermediate-region-code"
        >
    let private countriesFile() = CountriesProvider.Load(``countries csv file``)
    let Regions(): IEnumerable<Region> =
        countriesFile().Rows
        |> Seq.distinctBy (fun row -> row.``Region-code``)
        |> Seq.mapi (fun index row -> {Id = index+1; Name = row.Region; NumericCode = row.``Region-code``})
        |> Seq.filter (fun r -> not (String.IsNullOrWhiteSpace r.Name))
    let RegionsByCode() =
        Regions()
        |> Seq.map (fun r -> (r.NumericCode, r))
        |> dict
    let Countries(): IEnumerable<Country> =
        let regionsByCode = RegionsByCode()
        countriesFile().Rows
        |> Seq.distinctBy (fun row -> row.``Country-code``)
        |> Seq.mapi (
            fun index row ->
                let regionId =
                    let exists, r = regionsByCode.TryGetValue row.``Region-code``
                    if exists
                    then Nullable<int> r.Id
                    else Unchecked.defaultof<Nullable<int>>
                {
                    Id = index+1;
                    RegionId = regionId;
                    Name = row.Name;
                    NumericCode = row.``Country-code``;
                    Alpha2Code = row.``Alpha-2``;
                    Alpha3Code = row.``Alpha-3``;
                })
    let CountriesByCode() =
        Countries()
        |> Seq.map (fun c -> (c.NumericCode, c))
        |> dict
Country is a reference table, so no change mechanism besides migrations is planned. Am I abusing data seeding or is it somehow possible to avoid codebase bloating?
About this issue
- State: open
- Created 5 years ago
- Comments: 22 (11 by maintainers)
@voroninp #2174
@voroninp We discussed this again in triage and we may consider doing something to make this more efficient when the seed data will never change. However, for now, as @bricelam said above, the appropriate workaround is:
You’ll still need a project reference. The compiled assembly just won’t have an assembly reference.
ModelSnapshot is a snapshot of the model from the last time you ran Add-Migration. It’s used to get the diff the next time you call Add-Migration.
Migration.BuildTargetModel is a copy of the model for that migration. It’s used when the steps in the Up method aren’t enough. For example, we use it to rebuild indexes on SQL Server when a column type is narrowed from int to smallint. It’s rarely used, but enables scenarios that would otherwise require the user to manually add additional steps to the Up method.
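Since both ModelSnapshot and BuildTargetModel are copies of the model, anything configured through HasData is repeated in each of them. One possible way to keep never-changing reference rows out of the model is to drop the HasData call and insert the rows from the migration itself. The following is only a sketch under assumptions: the migration name SeedCountries is hypothetical, the class is assumed to have been scaffolded as an empty migration by Add-Migration (so the generated designer file supplies the migration attributes), and the column names are assumed to match the Country properties shown in the question. It is not necessarily the workaround referred to above.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

// Hypothetical migration, scaffolded empty with Add-Migration and then edited by hand.
// Assumes CountryConfiguration no longer calls builder.HasData(...).
public partial class SeedCountries : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // The rows are written by this migration only. They are not part of the
        // model, so they are not copied into ModelSnapshot or BuildTargetModel.
        // RegionId is omitted here; the column is nullable, so it is left NULL.
        migrationBuilder.InsertData(
            table: "Country",
            columns: new[] { "Id", "Name", "NumericCode", "Alpha2Code", "Alpha3Code" },
            values: new object[,]
            {
                // Two illustrative rows; a real migration would list all of them.
                { 1, "Afghanistan", "004", "AF", "AFG" },
                { 2, "Albania", "008", "AL", "ALB" }
            });
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Remove the same rows by primary key when the migration is reverted.
        migrationBuilder.DeleteData(
            table: "Country",
            keyColumn: "Id",
            keyValues: new object[] { 1, 2 });
    }
}
```

The trade-off of this approach is that EF no longer diffs the rows: any later change to the reference data has to be written by hand in a new migration.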