org.apache.spark.ml.recommendation

GlintFMPair

class GlintFMPair extends Estimator[GlintFMPairModel] with GlintFMPairParams with DefaultParamsWritable

Distributed pairwise factorization machine / LightFM.

Pairwise factorization machines are trained on implicit-feedback data with Bayesian personalized ranking (BPR): for each user, the items with an observed user-item training instance should rank above all other items.
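The ranking objective can be illustrated for a single training triple of user, observed item and unobserved item. This is a minimal sketch of the standard BPR criterion, not the code of this class: the object and method names are illustrative, the scores are plain numbers rather than factorization machine outputs, and regularization is omitted.

```scala
object BprSketch {
  /** Logistic sigmoid. */
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  /** BPR loss for one (user, positive item, negative item) triple:
    * -ln(sigmoid(positiveScore - negativeScore)).
    * The loss shrinks as the observed item is scored further above the other item. */
  def bprLoss(positiveScore: Double, negativeScore: Double): Double =
    -math.log(sigmoid(positiveScore - negativeScore))
}
```

With equal scores the loss is ln(2); it falls toward 0 as the observed item's score pulls ahead.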

This implementation uses Glint parameter servers with custom methods for network-efficient training. The parameter servers can run in a separate Spark application that is started beforehand; in that case the host of the parameter server master is passed via parameterServerHost. If parameterServerHost is not set, a standalone parameter server cluster is started in this Spark application.
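A minimal configuration sketch that uses only the constructor, setters and fit method listed on this page. It assumes Spark and this library are on the classpath. The column names and most values repeat the documented defaults; psHost is a placeholder for the parameter server master host, the maxIter value is arbitrary, and the wrapper object is hypothetical.

```scala
import org.apache.spark.ml.recommendation.{GlintFMPair, GlintFMPairModel}
import org.apache.spark.sql.DataFrame

object GlintFMPairUsageSketch {
  /** Configures the estimator and fits it on `training`.
    * `psHost` is a placeholder for the host of the parameter server master. */
  def train(training: DataFrame, psHost: String): GlintFMPairModel =
    new GlintFMPair()
      .setUserCol("userid")                   // documented default
      .setItemCol("itemid")                   // documented default
      .setItemFeaturesCol("itemfeatures")     // documented default
      .setUserctxFeaturesCol("userctxfeatures") // documented default
      .setNumDims(150)                        // documented default
      .setBatchSize(256)                      // documented default
      .setSampler("uniform")                  // documented default
      .setMaxIter(10)                         // arbitrary value for illustration
      .setParameterServerHost(psHost)
      .fit(training)
}
```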

Linear Supertypes
DefaultParamsWritable, MLWritable, GlintFMPairParams, HasPredictionCol, HasSeed, HasStepSize, HasMaxIter, Estimator[GlintFMPairModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new GlintFMPair()

  2. new GlintFMPair(uid: String)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def aggFeatureProbabilities(df: DataFrame, numCols: Int): Array[Float]

    Computes the feature probabilities of the sparse feature vector columns, using mapPartitions for efficiency and treeReduce to avoid out-of-memory errors on the driver.

    df

    The data frame to aggregate. Its sparse vector columns must start at column 0.

    numCols

    The number of sparse vector columns
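The aggregation pattern described for aggFeatureProbabilities can be sketched without Spark. This is a simplified illustration, not the library's code: it assumes that a feature's probability is the fraction of rows in which the feature is active, it handles a single sparse column, and it simulates partitions with nested sequences. All names are hypothetical.

```scala
object FeatureProbabilitySketch {
  /** Partial result of one partition: per-feature counts and the number of rows seen. */
  final case class Partial(counts: Array[Long], rows: Long)

  /** Counts, within one partition, how often each feature index is active.
    * Each row is the array of active indices of one sparse vector. */
  def countPartition(rows: Iterator[Array[Int]], numFeatures: Int): Partial = {
    val counts = new Array[Long](numFeatures)
    var n = 0L
    rows.foreach { activeIndices =>
      activeIndices.foreach(i => counts(i) += 1L)
      n += 1L
    }
    Partial(counts, n)
  }

  /** Combines two partial results element-wise. */
  def merge(a: Partial, b: Partial): Partial =
    Partial(a.counts.zip(b.counts).map { case (x, y) => x + y }, a.rows + b.rows)

  /** Merges partials pairwise, level by level, so that no single step
    * has to combine all partials at once (the idea behind treeReduce). */
  @scala.annotation.tailrec
  def treeMerge(partials: Vector[Partial]): Partial = {
    require(partials.nonEmpty, "at least one partition is required")
    if (partials.size == 1) partials.head
    else treeMerge(partials.grouped(2).map(_.reduce(merge)).toVector)
  }

  /** Fraction of rows in which each feature is active (assumed definition). */
  def probabilities(partitions: Seq[Seq[Array[Int]]], numFeatures: Int): Array[Float] = {
    val total = treeMerge(partitions.map(p => countPartition(p.iterator, numFeatures)).toVector)
    total.counts.map(c => if (total.rows == 0L) 0f else (c.toDouble / total.rows).toFloat)
  }
}
```

In Spark the per-partition step would run inside mapPartitions and the merge inside treeReduce, so only a few partial arrays reach the driver.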

  6. def aggWeightedFeatureProbabilities(df: DataFrame): Array[Float]

    Computes the weighted feature probabilities of a sparse feature vector column, using mapPartitions for efficiency and treeReduce to avoid out-of-memory errors on the driver.

    df

    The data frame to aggregate. It must have a double weight column at column 0 and a sparse vector column at column 1.

  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. final val batchSize: IntParam

    The per-worker mini-batch size.

    Default: 256

    Definition Classes
    GlintFMPairParams
  9. final def clear(param: Param[_]): GlintFMPair.this.type

    Definition Classes
    Params
  10. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  11. def copy(extra: ParamMap): Estimator[GlintFMPairModel]

    Definition Classes
    GlintFMPair → Estimator → PipelineStage → Params
  12. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  13. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  16. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  17. def explainParams(): String

    Definition Classes
    Params
  18. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  19. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  20. final val factorsReg: FloatParam

    The regularization rate for the latent factor weights.

    Default: 0.001f

    Definition Classes
    GlintFMPairParams
  21. final val filterItemsCol: Param[String]

    The name of the integer array column containing the itemCol ids of the items to filter from recommendations. If empty, recommendations are not filtered. Usually the arrays contain the ids of the user's own items.

    Default: ""

    Definition Classes
    GlintFMPairParams
  22. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. def fit(dataset: Dataset[_]): GlintFMPairModel

    Fits a GlintFMPairModel on the data set.

    dataset

    The data set containing the columns (userCol: Int, itemCol: Int, itemfeaturesCol: SparseVector, userctxfeaturesCol: SparseVector) and, if acceptance sampling is used, also samplingCol.

    Definition Classes
    GlintFMPair → Estimator
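A sketch of an input data frame with the columns that fit expects, under the default column names. It assumes Spark is on the classpath and that the sparse vectors are those of org.apache.spark.ml.linalg; the feature vector sizes and contents are arbitrary illustration values, and the wrapper object is hypothetical.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

object FitInputSketch {
  /** Builds a tiny data frame with one row per observed user-item interaction. */
  def exampleInput(spark: SparkSession): DataFrame =
    spark.createDataFrame(Seq(
      // (userid, itemid, itemfeatures, userctxfeatures)
      (0, 0, Vectors.sparse(3, Array(0), Array(1.0)), Vectors.sparse(2, Array(0), Array(1.0))),
      (0, 2, Vectors.sparse(3, Array(2), Array(1.0)), Vectors.sparse(2, Array(0), Array(1.0))),
      (1, 1, Vectors.sparse(3, Array(1), Array(1.0)), Vectors.sparse(2, Array(1), Array(1.0)))
    )).toDF("userid", "itemid", "itemfeatures", "userctxfeatures")
}
```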
  24. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[GlintFMPairModel]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  25. def fit(dataset: Dataset[_], paramMap: ParamMap): GlintFMPairModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  26. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): GlintFMPairModel

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  27. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  28. def getBatchSize: Int

    Definition Classes
    GlintFMPairParams
  29. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  30. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  31. def getFactorsReg: Float

    Definition Classes
    GlintFMPairParams
  32. def getFilterItemsCol: String

    Definition Classes
    GlintFMPairParams
  33. def getItemCol: String

    Definition Classes
    GlintFMPairParams
  34. def getItemfeaturesCol: String

    Definition Classes
    GlintFMPairParams
  35. def getLinearReg: Float

    Definition Classes
    GlintFMPairParams
  36. def getLoadMetadata: Boolean

    Definition Classes
    GlintFMPairParams
  37. final def getMaxIter: Int

    Definition Classes
    HasMaxIter
  38. def getMetadataPath: String

    Definition Classes
    GlintFMPairParams
  39. def getNumDims: Int

    Definition Classes
    GlintFMPairParams
  40. def getNumParameterServers: Int

    Definition Classes
    GlintFMPairParams
  41. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  42. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  43. def getParameterServerConfig: Config

    Definition Classes
    GlintFMPairParams
  44. def getParameterServerHost: String

    Definition Classes
    GlintFMPairParams
  45. final def getPredictionCol: String

    Definition Classes
    HasPredictionCol
  46. def getRho: Double

    Definition Classes
    GlintFMPairParams
  47. def getSampler: String

    Definition Classes
    GlintFMPairParams
  48. def getSamplingCol: String

    Definition Classes
    GlintFMPairParams
  49. def getSaveMetadata: Boolean

    Definition Classes
    GlintFMPairParams
  50. final def getSeed: Long

    Definition Classes
    HasSeed
  51. final def getStepSize: Double

    Definition Classes
    HasStepSize
  52. def getTreeDepth: Int

    Definition Classes
    GlintFMPairParams
  53. def getUserCol: String

    Definition Classes
    GlintFMPairParams
  54. def getUserctxfeaturesCol: String

    Definition Classes
    GlintFMPairParams
  55. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  56. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  57. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  58. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  59. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  60. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  61. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  62. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  63. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  64. final val itemCol: Param[String]

    The name of the item id column. The ids are integers from 0 to the number of items in the training dataset.

    Default: "itemid"

    Definition Classes
    GlintFMPairParams
  65. final val itemfeaturesCol: Param[String]

    The name of the item feature column of sparse vectors.

    Default: "itemfeatures"

    Definition Classes
    GlintFMPairParams
  66. final val linearReg: FloatParam

    The regularization rate for the linear weights.

    Default: 0.01f

    Definition Classes
    GlintFMPairParams
  67. final val loadMetadata: BooleanParam

    Whether the metadata of the data frame to fit should be loaded from HDFS. This allows skipping the metadata computation stages when fitting on the same data frame with different parameters. Metadata for the "crossbatch" and "uniform" samplers is interchangeable, but "exp" requires its own metadata.

    Default: false

    Definition Classes
    GlintFMPairParams
  68. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  69. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  76. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  79. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  80. final val maxIter: IntParam

    Definition Classes
    HasMaxIter
  81. final val metadataPath: Param[String]

    The HDFS path from which to load the metadata of the data frame to fit, or to which to save the computed metadata.

    Default: ""

    Definition Classes
    GlintFMPairParams
  82. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  83. final def notify(): Unit

    Definition Classes
    AnyRef
  84. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  85. final val numDims: IntParam

    The number of latent factor dimensions (k).

    Default: 150

    Definition Classes
    GlintFMPairParams
  86. final val numParameterServers: IntParam

    The number of parameter servers.

    Default: 3

    Definition Classes
    GlintFMPairParams
  87. final val parameterServerConfig: Param[Config]

    The parameter server configuration. Allows for detailed configuration of the parameter servers with the default configuration as fallback.

    Default: ConfigFactory.empty()

    Definition Classes
    GlintFMPairParams
  88. final val parameterServerHost: Param[String]

    The master host of the running parameter servers. If this is not set, a standalone parameter server cluster is started in this Spark application.

    Default: ""

    Definition Classes
    GlintFMPairParams
  89. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  90. final val predictionCol: Param[String]

    Definition Classes
    HasPredictionCol
  91. final val rho: DoubleParam

    The rho value to use for the "exp" sampler. Has to be between 0.0 and 1.0.

    Default: 1.0

    Definition Classes
    GlintFMPairParams
  92. final val sampler: Param[String]

    The sampler to use.

    "uniform" samples negative items uniformly, as originally proposed for BPR.

    "exp" samples negative items with probability proportional to their exponential popularity distribution, as proposed in LambdaFM.

    "crossbatch" samples negative items uniformly but shares them across the mini-batch (crossbatch-BPR loss), as proposed in the author's master's thesis.

    Default: "uniform"

    Definition Classes
    GlintFMPairParams
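The difference between the "uniform" and "exp" samplers can be sketched as two probability distributions over items. This is an illustration under stated assumptions, not the library's code: the exponential form follows the static popularity sampler described for LambdaFM, where the item at popularity rank r (0-based) gets a weight of exp(-(r + 1) / (numItems * rho)), and rho is required to be strictly positive here to avoid a division by zero. The "crossbatch" sampler is not covered. All names are hypothetical.

```scala
import scala.util.Random

object NegativeSamplerSketch {
  /** "uniform": every item is equally likely to be drawn as a negative item. */
  def uniformProbabilities(numItems: Int): Array[Double] =
    Array.fill(numItems)(1.0 / numItems)

  /** "exp" (assumed form): items are ranked by descending popularity and the item
    * at rank r gets a weight of exp(-(r + 1) / (numItems * rho)), then normalized. */
  def expProbabilities(popularity: Array[Long], rho: Double): Array[Double] = {
    require(rho > 0.0 && rho <= 1.0, "rho must be in (0, 1] for this sketch")
    val n = popularity.length
    // Item indices ordered from most to least popular.
    val order = popularity.indices.sortBy(i => -popularity(i))
    val weights = new Array[Double](n)
    order.zipWithIndex.foreach { case (item, rank) =>
      weights(item) = math.exp(-(rank + 1).toDouble / (n * rho))
    }
    val total = weights.sum
    weights.map(_ / total)
  }

  /** Draws one item index from a discrete distribution by inverse-CDF sampling. */
  def sample(probabilities: Array[Double], random: Random): Int = {
    val u = random.nextDouble()
    var cumulative = 0.0
    var i = 0
    while (i < probabilities.length - 1) {
      cumulative += probabilities(i)
      if (u < cumulative) return i
      i += 1
    }
    probabilities.length - 1
  }
}
```

Under this form, smaller rho values concentrate the probability mass on the most popular items, while rho = 1.0 gives the flattest of the allowed distributions.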
  93. final val samplingCol: Param[String]

    The name of the integer column to use for acceptance sampling. If empty, all items are accepted as negative items. Otherwise, an item is accepted as a negative item only if there is no interaction between the user and the item's sampling column value. Usually the sampling column is the same as itemCol, but it may also be another column with an n-to-1 relation from item column value to sampling column value.

    Consider the example of playlists with "pid" as user column and tracks with "traid" as item column. Another column "artid" holds the artist of the track. With "traid" as sampling column, only tracks that are not in the playlist are accepted as negative items. With "artid" as sampling column, only tracks whose artists are not in the playlist are accepted as negative items.

    Default: ""

    Definition Classes
    GlintFMPairParams
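The playlist example for samplingCol can be written out as a small acceptance check. This is a sketch of the rule as described, not the library's code; the Track type and the acceptNegative function are hypothetical names.

```scala
object AcceptanceSamplingSketch {
  /** A track with its item id ("traid") and the id of its artist ("artid"). */
  final case class Track(traid: Int, artid: Int)

  /** Accepts `candidate` as a negative item only if no track in the playlist
    * shares its sampling column value. `samplingKey` selects the sampling column. */
  def acceptNegative(candidate: Track, playlist: Seq[Track], samplingKey: Track => Int): Boolean =
    !playlist.exists(t => samplingKey(t) == samplingKey(candidate))
}
```

For a playlist holding tracks 1 and 2 by artists 10 and 11, a candidate track 3 by artist 10 is accepted when sampling on "traid" but rejected when sampling on "artid".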
  94. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  95. final val saveMetadata: BooleanParam

    Whether the metadata of the fitted data frame should be saved to HDFS.

    Default: false

    Definition Classes
    GlintFMPairParams
  96. final val seed: LongParam

    Definition Classes
    HasSeed
  97. final def set(paramPair: ParamPair[_]): GlintFMPair.this.type

    Attributes
    protected
    Definition Classes
    Params
  98. final def set(param: String, value: Any): GlintFMPair.this.type

    Attributes
    protected
    Definition Classes
    Params
  99. final def set[T](param: Param[T], value: T): GlintFMPair.this.type

    Definition Classes
    Params
  100. def setBatchSize(value: Int): GlintFMPair.this.type

  101. final def setDefault(paramPairs: ParamPair[_]*): GlintFMPair.this.type

    Attributes
    protected
    Definition Classes
    Params
  102. final def setDefault[T](param: Param[T], value: T): GlintFMPair.this.type

    Attributes
    protected
    Definition Classes
    Params
  103. def setFactorsReg(value: Float): GlintFMPair.this.type

  104. def setFilterItemsCol(value: String): GlintFMPair.this.type

  105. def setItemCol(value: String): GlintFMPair.this.type

  106. def setItemFeaturesCol(value: String): GlintFMPair.this.type

  107. def setLinearReg(value: Float): GlintFMPair.this.type

  108. def setLoadMetadata(value: Boolean): GlintFMPair.this.type

  109. def setMaxIter(value: Int): GlintFMPair.this.type

  110. def setMetadataPath(value: String): GlintFMPair.this.type

  111. def setNumDims(value: Int): GlintFMPair.this.type

  112. def setNumParameterServers(value: Int): GlintFMPair.this.type

  113. def setParameterServerConfig(value: Config): GlintFMPair.this.type

  114. def setParameterServerHost(value: String): GlintFMPair.this.type

  115. def setPredictionCol(value: String): GlintFMPair.this.type

  116. def setSampler(value: String): GlintFMPair.this.type

  117. def setSamplingCol(value: String): GlintFMPair.this.type

  118. def setSaveMetadata(value: Boolean): GlintFMPair.this.type

  119. def setSeed(value: Long): GlintFMPair.this.type

  120. def setStepSize(value: Double): GlintFMPair.this.type

  121. def setUserCol(value: String): GlintFMPair.this.type

  122. def setUserctxFeaturesCol(value: String): GlintFMPair.this.type

  123. val stepSize: DoubleParam

    Definition Classes
    HasStepSize
  124. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  125. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  126. def transformSchema(schema: StructType): StructType

    Definition Classes
    GlintFMPair → PipelineStage
  127. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  128. final val treeDepth: IntParam

    The depth to use for tree reduce when computing the metadata. The depth has to be large enough to avoid out-of-memory errors, but lower depths might lead to faster runtimes.

    Definition Classes
    GlintFMPairParams
  129. val uid: String

    Definition Classes
    GlintFMPair → Identifiable
  130. final val userCol: Param[String]

    The name of the user id column of integers.

    Default: "userid"

    Definition Classes
    GlintFMPairParams
  131. final val userctxfeaturesCol: Param[String]

    The name of the user and context feature column of sparse vectors.

    Default: "userctxfeatures"

    Definition Classes
    GlintFMPairParams
  132. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  133. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  134. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  135. def write: MLWriter

    Definition Classes
    DefaultParamsWritable → MLWritable
