
CDH + Kylin trilogy II: deployment and setup

2022-01-27 02:34:55 Alibaba_ Open Source

 Insert picture description here

Restart the CDH server

Because the SELinux and swap settings were modified, the operating system must be restarted for them to take effect, so please restart the CDH server.

Run the ansible script to start the CDH service (on the ansible machine)

  1. Wait for the CDH server to finish rebooting;

  2. Log in to the ansible machine and go to the ~/playbooks directory;

  3. Run the script that initializes the database and starts CDH: ansible-playbook cdh-single-start.yml

  4. When startup completes, the following information is output:

 Insert picture description here

  1. SSH into the CDH server and run this command to watch the CDH service start: tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log. When you see the content in the red box below, startup is complete and you can log in with a browser:

 Insert picture description here
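
The wait described above can also be scripted. The following is a minimal sketch, assuming curl is installed on the machine you run it from and that the Cloudera Manager web page on port 7180 only starts answering once startup has finished. wait_for_http is a hypothetical helper name, not something shipped with CDH.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts, seconds to sleep between attempts.
wait_for_http() (
  url=$1
  attempts=$2
  interval=$3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, --max-time: cap each try
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "up"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timeout"
  return 1
)
```

For example, wait_for_http http://192.168.50.134:7180 60 10 would try up to 60 times, sleeping 10 seconds between tries. Watching the log with tail -f remains the more direct way to see what the server is doing.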

Setup (in the browser)

The CDH service is now running and can be operated through a browser:

  1. Open http://192.168.50.134:7180 in a browser, as shown below. The account and password are both admin:

 Insert picture description here

  1. Click Next through the pages; on the edition selection page, choose the 60-day trial:

 Insert picture description here

  1. The host selection page shows the CDH server (deskmini):

 Insert picture description here

  1. On the CDH version selection page, select 5.16.2-1, shown in the red box below:

 Insert picture description here

  1. On the Parcel installation page, the download progress reaches 100% almost instantly because the offline parcel package was uploaded earlier. Wait for distribution, unpacking, and activation to complete:

 Insert picture description here

  1. Some recommended actions are shown next. Select the option in the red box, as in the figure below, to skip them:

 Insert picture description here

  1. Next is the service selection page. I chose custom services and selected these eight: HBase, HDFS, Hive, Hue, Oozie, Spark, YARN, and ZooKeeper, which are enough to run Kylin:

 Insert picture description here

  1. On the host assignment page, choose the CDH server for everything:

 Insert picture description here

  1. Next is the database settings page. What you fill in must match the figure below: the host name is localhost; for Hive, the database, user, and password are all hive; for Activity Monitor, all amon; for Reports Manager, all rman; for Oozie Server, all oozie; for Hue, all hue. These values are fixed in the ansible script, so the entries here must match them:

 Insert picture description here

  1. On the parameter settings page, set the paths according to your own disks. My /home directory has plenty of space, so I moved all storage locations under /home:

 Insert picture description here

  1. Wait for the services to start:

 Insert picture description here

  1. All services have started:

 Insert picture description here

HDFS setup

  1. As shown in the red box below, the HDFS service reports a problem:

 Insert picture description here

  1. Click the red exclamation mark in the figure above to see the details of the problem, shown below. It is a common replication problem:

 Insert picture description here

  1. As shown below, on the HDFS configuration page, set dfs.replication to 1 (there is only one data node):

 Insert picture description here

  1. After this change the default replica count is 1, but the replica count of existing files is not updated and must be reset. SSH into the CDH server;

  2. Run su - hdfs to switch to the hdfs account, then run the following command to set the replica count:

hadoop fs -setrep -R 1 /

  1. Back in the web page, restart the HDFS service, as shown below:

 Insert picture description here

  1. After the restart, the HDFS service is healthy:

 Insert picture description here
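
To double-check that the setrep command reached every existing file, the recursive listing can be filtered by its replication column. This is a minimal sketch: files_not_at_replication is a hypothetical helper, and it assumes that no path contains spaces.

```shell
# Hypothetical helper: read `hadoop fs -ls -R` output on stdin and print the
# path of every file whose replication factor differs from the wanted value.
# In that listing the second column is the replication factor ("-" for
# directories) and the last column is the path.
# Assumption: paths contain no spaces.
files_not_at_replication() {
  awk -v want="$1" 'NF >= 8 && $2 != "-" && $2 != want { print $NF }'
}
```

Run as the hdfs account, hadoop fs -ls -R / | files_not_at_replication 1 should print nothing once every file is at one replica.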

YARN setup

The default YARN parameters are very conservative, so some settings are needed before Spark tasks can run:

  1. Go to the YARN management page;

  2. As shown in the figure below, check the value of yarn.nodemanager.resource.cpu-vcores. It must be greater than 1; otherwise YARN will not allocate resources to run a Spark task after it is submitted. (If your CDH server is a virtual machine with only one CPU core, this parameter will be set to 1. The fix is to increase the number of CPU cores of the virtual machine first, then change this parameter.):

 Insert picture description here

  1. yarn.scheduler.minimum-allocation-mb: the minimum memory a single container can request. I set it to 1 GB;

  2. yarn.scheduler.maximum-allocation-mb: the maximum memory a single container can request. I set it to 8 GB;

  3. yarn.nodemanager.resource.memory-mb: the maximum memory available on the node. I set it to 8 GB;

  4. These three values are based on my CDH server having 32 GB of memory. Please adjust them to your own hardware;

  5. Restart the YARN service after changing the settings, as shown in the figure below:

 Insert picture description here
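
The relationship between the YARN values above can be checked with a little arithmetic before entering them. This is a minimal sketch: max_containers and check_yarn_settings are hypothetical helpers, the memory figures are the ones from this article converted to MB, and the vcore count of 4 is an assumed example. It counts containers by memory alone.

```shell
# Hypothetical helper: how many minimum-sized containers fit on one node.
# Arguments: yarn.nodemanager.resource.memory-mb,
#            yarn.scheduler.minimum-allocation-mb
max_containers() (
  node_mb=$1
  min_mb=$2
  echo $((node_mb / min_mb))
)

# Hypothetical helper: sanity-check the values before entering them.
# Arguments: vcores, node memory MB, minimum allocation MB, maximum allocation MB
check_yarn_settings() (
  vcores=$1
  node_mb=$2
  min_mb=$3
  max_mb=$4
  if [ "$vcores" -le 1 ]; then
    echo "cpu-vcores must be greater than 1"
    return 1
  fi
  if [ "$min_mb" -gt "$max_mb" ]; then
    echo "minimum allocation exceeds maximum allocation"
    return 1
  fi
  if [ "$max_mb" -gt "$node_mb" ]; then
    echo "maximum allocation exceeds node memory"
    return 1
  fi
  echo "ok"
)

# Values from the article in MB (1 GB = 1024 MB); 4 vcores is an assumed example.
max_containers 8192 1024              # prints 8
check_yarn_settings 4 8192 1024 8192  # prints ok
```

With 8 GB on the node and a 1 GB minimum, at most eight minimum-sized containers fit at once, and a single 8 GB container would use the whole node.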

Spark setup (on the CDH server)

A directory and related jars need to be prepared in the Spark environment; otherwise Kylin reports an error on startup (the message is: spark not found, set SPARK_HOME, or run bin/download-spark.sh). SSH into the CDH server as root and run the following commands:

