machinelearning: Sample fails with "The size of input lines is not consistent"

I’m trying out the sample shown here. However, whenever I try to train the model, I get the error “The size of input lines is not consistent”. I’m using the exact files specified in the tutorial, so I’m not sure where I’m going wrong. Any ideas?

#r "netstandard"
#load @"C:\Users\Isaac\Source\Repos\scratchpad\.paket\load\netstandard2.0\ML\ml.group.fsx"

open Microsoft.ML
open Microsoft.ML.Runtime.Api
open Microsoft.ML.Transforms
open Microsoft.ML.Trainers

let dataPath = @"data\imdb_labelled.txt"
let testDataPath = @"data\yelp_labelled.txt"

type SentimentData =
    { [<Column(ordinal = "0")>] SentimentText : string
      [<Column(ordinal = "1", name = "Label")>] Sentiment : float }

[<CLIMutable>]
type SentimentPrediction =
    { [<ColumnName "PredictedLabel">] Sentiment : bool }

let pipeline = LearningPipeline()
pipeline.Add(TextLoader<SentimentData>(dataPath, useHeader = false, separator = "tab"))
pipeline.Add(TextFeaturizer("Features", "SentimentText"))
pipeline.Add(FastTreeBinaryClassifier(NumLeaves = 5, NumTrees = 5, MinDocumentsInLeafs = 2))

// Pop! Training throws "The size of input lines is not consistent" here.
let model = pipeline.Train<SentimentData, SentimentPrediction>()
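
Taken literally, the error message says the loader saw lines with differing numbers of fields, so one thing to rule out is the data files themselves. Below is a minimal sketch of such a check, assuming the files are plain tab-separated text. `fieldCountsOf` and `report` are hypothetical helper names, not part of ML.NET, and a clean result would only show that the cause lies elsewhere.

```fsharp
open System.IO

/// Groups lines by how many fields they contain when split on `separator`.
/// Returns (fieldCount, 1-based line numbers) pairs, most common count first.
let fieldCountsOf (separator : char) (lines : seq<string>) =
    lines
    |> Seq.mapi (fun i line -> i + 1, line.Split(separator).Length)
    |> Seq.groupBy snd
    |> Seq.map (fun (count, rows) -> count, rows |> Seq.map fst |> List.ofSeq)
    |> Seq.sortByDescending (fun (_, rows) -> rows.Length)
    |> List.ofSeq

/// Prints one summary line per distinct field count found in the file.
let report (path : string) =
    File.ReadLines path
    |> fieldCountsOf '\t'
    |> List.iter (fun (count, rows) ->
        printfn "%d field(s): %d line(s), first at line %d" count rows.Length rows.Head)
```

Calling `report dataPath` would print a single summary line if every row has the same number of fields. More than one summary line would point at the rows worth inspecting.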

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 19 (11 by maintainers)

Most upvoted comments

Are native folders a “proper” thing in NuGet packages?

Check out https://docs.microsoft.com/en-us/nuget/create-packages/supporting-multiple-target-frameworks#architecture-specific-folders for the docs on the runtimes folder:

If you have architecture-specific assemblies, that is, separate assemblies that target ARM, x86, and x64, you must place them in a folder named runtimes within sub-folders named {platform}-{architecture}\lib\{framework} or {platform}-{architecture}\native.
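
As a rough illustration of that convention, a package shipping native binaries might be laid out as below. This is a sketch only: the file names are hypothetical, and the runtime identifiers are common examples rather than the contents of any particular package.

```text
runtimes\
    win-x64\
        native\
            MyNativeLib.dll          (hypothetical file name)
    linux-x64\
        native\
            libMyNativeLib.so        (hypothetical file name)
    osx-x64\
        native\
            libMyNativeLib.dylib     (hypothetical file name)
```

Tooling that restores the package has to understand this layout for the native files to end up where the managed assembly can load them.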