In continuation to the previous post, using the same example of stations and trips,
scala> val bcStations = sc.broadcast(station.keyBy(_.id).collectAsMap)
bcStations: org.apache.spark.broadcast.Broadcast[scala.collection.Map[Int,Station]] = Broadcast(19)
scala> val bcStations = sc.broadcast(station.keyBy(_.id))
WARN MemoryStore: Not enough space to store block broadcast_20 in memory! Free memory is 252747804 bytes.
WARN MemoryStore: Persisting block broadcast_20 to disk instead.
bcStations: org.apache.spark.broadcast.Broadcast[org.apache.spark.rdd.RDD[(Int, Station)]] = Broadcast(20)
scala> val bcStations = sc.broadcast(station.keyBy(_.id).collectAsMap)
bcStations: org.apache.spark.broadcast.Broadcast[scala.collection.Map[Int,Station]] = Broadcast(19)
scala> val bcStations = sc.broadcast(station.keyBy(_.id))
WARN MemoryStore: Not enough space to store block broadcast_20 in memory! Free memory is 252747804 bytes.
WARN MemoryStore: Persisting block broadcast_20 to disk instead.
bcStations: org.apache.spark.broadcast.Broadcast[org.apache.spark.rdd.RDD[(Int, Station)]] = Broadcast(20)
scala> bcStations.value.take(3).foreach(println)
(83,Station(83,Mezes Park,37.491269,-122.236234,15,Redwood City,2/20/2014))
(77,Station(77,Market at Sansome,37.789625,-122.400811,27,San Francisco,8/25/2013))
(23,Station(23,San Mateo County Center,37.488501,-122.231061,15,Redwood City,8/15/2013))
No comments:
Post a Comment