sql - Hadoop: Why is Hive so slow even on a tiny table?


I use JDBC from Scala to get data from Hive. In Hive I have a simple table with 20 rows in the following format:

user_id, movie_title, rating, date

I issue nested selects to group users by movie:

1) Select distinct user_id.

2) For each user_id: select distinct movie_title (all movies viewed by that user).

3) For each movie_title: select distinct user_id.

On a local Hive table with 20 rows, these nested queries take 26 minutes! Hive returns the first user_id only after one minute! Questions:

1) Why is Hive so slow?

2) Is there any way to optimize the nested selects?

Hive uses the MapReduce framework to process queries, which carries a fair amount of constant overhead per query. Each of your queries (and there are a fair number of them because of your nested loops) needs to spin up a MapReduce job, and that takes time no matter how little data you are querying.

Newer versions of Hive are much more responsive, but still not ideal for this type of selection.

Your best bet is to try to reduce the number of queries by using GROUP BY or something similar.
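As a rough sketch of that idea, the nested loops could probably be collapsed into a single query that returns, for each movie, the set of users who rated it. The table name `ratings` is an assumption (the question does not give one). The column names come from the question, and `collect_set` is Hive's built-in aggregate that gathers distinct values into an array.

```sql
-- Hypothetical table name `ratings`; columns as listed in the question.
-- A single GROUP BY query replaces the per-user and per-movie queries,
-- so Hive pays the job start-up cost once instead of once per nested select.
SELECT movie_title,
       collect_set(user_id) AS user_ids   -- distinct users per movie
FROM ratings
GROUP BY movie_title;
```

If an array column is awkward to read over JDBC, a plain `SELECT DISTINCT movie_title, user_id FROM ratings` followed by grouping the rows on the Scala side should give the same result, still with one round trip to Hive.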
