Building a Spark 1.3 Cluster from Scratch on Virtual Machines (via MobaXterm)

2023-12-15 04:50

This article walks through building a Spark 1.3 cluster from scratch on virtual machines managed through MobaXterm. Hopefully it serves as a useful reference for anyone setting up a similar environment.

Prerequisites

1. Hadoop: hadoop-2.4.1.tar.gz

2. Spark: download the prebuilt package matching the Hadoop version: spark-1.3.0-bin-hadoop2.4.tgz

3. Java: jdk-7u80-linux-x64.tar.gz

4. Scala: scala-2.11.8.tgz

 

Setting Up the Environment

In the MobaXterm session connected to the server, run sudo virt-manager to open the Virtual Machine Manager window, create a new virtual machine from the ubuntu16.04 image, then perform the following steps.

  • Set a static IP address
    ## Edit the interfaces file: replace "#iface ens3 inet dhcp" with the static configuration below
    sudo vim /etc/network/interfaces
    # The primary network interface
    auto ens3
    iface ens3 inet static
    address 192.168.122.54
    netmask 255.255.255.0
    gateway 192.168.122.1
    ## Restart networking
    /etc/init.d/networking restart
    ## Note: after cloning a machine this still has to be changed on each clone; the static IPs I used are
    master 192.168.122.54
    slave1 192.168.122.55
    slave2 192.168.122.56
    slave3 192.168.122.57
    slave4 192.168.122.58
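The hostname-to-address plan above follows one pattern, so it can be printed in one pass, which helps avoid typos when reconfiguring each clone. A small sketch using the names and the 192.168.122.54-58 range chosen above:

```shell
# Print the hostname -> static IP plan for the five machines
i=54
for host in master slave1 slave2 slave3 slave4; do
  echo "$host -> 192.168.122.$i"
  i=$((i + 1))
done
```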

     

  • Configure the hosts file
    ## Change the hostname
    sudo vim /etc/hostname 
    ## Set it to master; the clones will be renamed individually later.
    ## Edit the hosts file: keep only the localhost line and append the entries below
    ## (leaving other entries in place causes problems, discussed later!)
    sudo vim /etc/hosts
    192.168.122.54 master
    192.168.122.55 slave1
    192.168.122.56 slave2 
    192.168.122.57 slave3 
    192.168.122.58 slave4 
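The five entries can also be generated and appended in one command rather than typed on each machine. The sketch below writes to a scratch file (hosts.demo) so it is safe to try; on a real node, append to /etc/hosts instead (e.g. via sudo tee -a):

```shell
# Build the cluster hosts entries into a scratch copy of the hosts file
printf '127.0.0.1 localhost\n' > hosts.demo
i=54
for host in master slave1 slave2 slave3 slave4; do
  echo "192.168.122.$i $host" >> hosts.demo
  i=$((i + 1))
done
cat hosts.demo
```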

     

  • Set up the Java environment
    ## Extract
    sudo tar -zxvf jdk-7u80-linux-x64.tar.gz -C ./software/
    sudo mv jdkxxxx jdk    ## rename the extracted directory (e.g. jdk1.7.0_80)
    ## Configure environment variables
    sudo vim /etc/profile
    export JAVA_HOME=/home/zmx/software/jdk
    export JRE_HOME=/home/zmx/software/jdk/jre
    export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
    export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

     

  • Set up the Scala environment
    ## Extract
    sudo tar -zxvf scala-2.11.8.tgz -C software/
    sudo mv scala-2.11.8/ scala
    ## Append to the environment variables
    sudo vim /etc/profile
    ## /etc/profile should end up looking like this:
    export JAVA_HOME=/home/zmx/software/jdk
    export JRE_HOME=/home/zmx/software/jdk/jre
    export SCALA_HOME=/home/zmx/software/scala
    export PATH=$SCALA_HOME/bin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
    export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

     

  • Disable the firewall
    sudo ufw disable
    Firewall stopped and disabled on system startup

     

  • Clone the virtual machine: from the MobaXterm session, run sudo virt-manager to open the VM manager, then perform the clone

       

  • Build the cluster and verify that the machines can reach each other
    ## After cloning, remember to change the IP address and hostname on each clone.
    ## This gives five machines in total: master and slave1-slave4.
    ## Run the following on master; do the same on the other machines.
    ping slave1
    ping slave2
    ping slave3
    ping slave4

     

  • Set up passwordless master-slave SSH login
    ## Generate a private/public key pair on every machine
    ssh-keygen -t rsa
    ## Send each slave's id_rsa.pub to master with scp
    scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave1
    scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave2
    scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave3
    scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave4
    ## On master, append all public keys to the authentication file authorized_keys
    zmx@master:~$ cat .ssh/id_rsa.pub* >> ~/.ssh/authorized_keys
    ## Distribute the key file to the slaves
    scp .ssh/authorized_keys zmx@slave1:~/.ssh/
    scp .ssh/authorized_keys zmx@slave2:~/.ssh/
    scp .ssh/authorized_keys zmx@slave3:~/.ssh/
    scp .ssh/authorized_keys zmx@slave4:~/.ssh/
    ## Finally, on every host, verify that login works without a password
    ssh slave1
    ssh slave2
    ssh slave3
    ssh slave4
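The four manual login checks can be collapsed into a loop. A sketch: ssh's BatchMode option makes the connection fail instead of prompting for a password, so a missing key shows up immediately. The leading echo keeps this a dry run that only prints the commands; remove it to actually run the checks:

```shell
# Dry run: print the non-interactive login check for every slave
for host in slave1 slave2 slave3 slave4; do
  echo ssh -o BatchMode=yes "$host" true
done
```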

Installing Hadoop

  Configuration

  • Configuration files
    • hadoop-env.sh
      ## Append at the end
      export HADOOP_IDENT_STRING=$USER
      export JAVA_HOME=/home/zmx/software/jdk
      export HADOOP_PREFIX=/home/zmx/software/hadoop-2.4.1
      

       

    • yarn-env.sh
      ## Append at the end
      export JAVA_HOME=/home/zmx/software/jdk

       

    • slaves: including master here means master is also treated as a slave
      master
      slave1
      slave2
      slave3
      slave4

       

    • core-site.xml
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <!-- Put site-specific property overrides in this file. -->
      <configuration>
        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://master:9000</value>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/zmx/software/hadoop-2.4.1/tmp</value>
        </property>
      </configuration>

       

    • hdfs-site.xml
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <!-- Put site-specific property overrides in this file. -->
      <configuration>
        <property>
          <name>dfs.datanode.ipc.address</name>
          <value>0.0.0.0:50020</value>
        </property>
        <property>
          <name>dfs.datanode.http.address</name>
          <value>0.0.0.0:50075</value>
        </property>
        <property>
          <name>dfs.replication</name>
          <value>2</value>
        </property>
      </configuration>

       

    • mapred-site.xml
      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <configuration>
        <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
        </property>
      </configuration>

       

    • yarn-site.xml
      <?xml version="1.0"?>
      <configuration>
        <!-- Site specific YARN configuration properties -->
        <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
        </property>
        <property>
          <name>yarn.resourcemanager.address</name>
          <value>master:8032</value>
        </property>
        <property>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>master:8030</value>
        </property>
        <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>master:8031</value>
        </property>
      </configuration>

       

  • Distribute the folder to the slaves:
    sudo chmod -R 777 ~/software/hadoop-2.4.1
    scp -r ~/software/hadoop-2.4.1 zmx@slave1:~/software/
    scp -r ~/software/hadoop-2.4.1 zmx@slave2:~/software/
    scp -r ~/software/hadoop-2.4.1 zmx@slave3:~/software/
    scp -r ~/software/hadoop-2.4.1 zmx@slave4:~/software/
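The four scp lines follow one pattern, so a loop does the same job and scales if more slaves are added later. A dry-run sketch that only prints the commands; remove the leading echo to actually copy:

```shell
# Dry run: print the distribution command for each slave
for host in slave1 slave2 slave3 slave4; do
  echo scp -r ~/software/hadoop-2.4.1 "zmx@${host}:~/software/"
done
```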

     

  • Start Hadoop
    ## From the hadoop directory, format HDFS
    bin/hdfs namenode -format
    ## Start HDFS
    sbin/start-dfs.sh
    ## Start YARN
    sbin/start-yarn.sh
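After start-dfs.sh and start-yarn.sh, the daemons on all five machines can be checked in one pass over ssh (this relies on the passwordless login configured earlier; jps ships with the JDK). A dry-run sketch that prints the per-node command; remove the leading echo before "ssh" to actually run it:

```shell
# Dry run: print the per-node daemon check
for host in master slave1 slave2 slave3 slave4; do
  echo "=== $host ==="
  echo ssh "$host" jps
done
```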

     

  • Check the Hadoop processes with jps
    ## On master
    ResourceManager
    SecondaryNameNode
    NameNode
    DataNode     (present because master is also its own slave; if you do not want this, remove master from the hadoop slaves file)
    NodeManager  (present for the same reason)
    ## On the slaves
    DataNode 
    NodeManager

     

  • Run yarn node -list to inspect the nodes
    zmx@master:~/software/hadoop-2.4.1$ yarn node -list
    19/03/18 10:01:32 INFO client.RMProxy: Connecting to ResourceManager at /192.168.122.54:8032
    19/03/18 10:01:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Total Nodes:5
             Node-Id    Node-State  Node-Http-Address  Number-of-Running-Containers
        slave4:36952       RUNNING        slave4:8042                             0
        slave2:39254       RUNNING        slave2:8042                             0
        master:38718       RUNNING        master:8042                             0
        slave1:42168       RUNNING        slave1:8042                             0
        slave3:43401       RUNNING        slave3:8042

     

 

Installing Spark

  • Edit the configuration files
    • spark-env.sh: a few basic settings; tune further for your own machines
      ## Append at the end
      export SCALA_HOME=/home/zmx/software/scala
      export JAVA_HOME=/home/zmx/software/jdk
      export HADOOP_HOME=/home/zmx/software/hadoop-2.4.1
      export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
      SPARK_MASTER_IP=master
      SPARK_LOCAL_DIRS=/home/zmx/software/spark-1.3.0-bin-hadoop2.4
      SPARK_DRIVER_MEMORY=512M

       

    • slaves: including master means master is also used as a worker; this is optional

      master
      slave1
      slave2
      slave3
      slave4

       

  • Distribute to the slaves
    scp -r ~/software/spark-1.3.0-bin-hadoop2.4 zmx@slave1:~/software/
    scp -r ~/software/spark-1.3.0-bin-hadoop2.4 zmx@slave2:~/software/
    scp -r ~/software/spark-1.3.0-bin-hadoop2.4 zmx@slave3:~/software/
    scp -r ~/software/spark-1.3.0-bin-hadoop2.4 zmx@slave4:~/software/

     

  • Start Spark
    sbin/start-all.sh
    ## After a successful start, jps shows
    ## on the master node:
    Master
    ## on the worker nodes:
    Worker

     

Testing the Hadoop and Spark installation in yarn-cluster mode

   Reference: verify the Hadoop 2.4.1 installation by running a larger data set

zmx@master:~/software/spark-1.3.0-bin-hadoop2.4$ ./bin/spark-submit \
> --class org.apache.spark.examples.JavaWordCount \
> --master yarn-cluster \
> lib/spark-examples*.jar \
> input/testWordCountFile.txt
Spark assembly has been built with Hive, including Datanucleus jars on classpath
19/03/18 21:04:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/03/18 21:04:06 INFO RMProxy: Connecting to ResourceManager at /192.168.122.54:8032
19/03/18 21:04:06 INFO Client: Requesting a new application from cluster with 5 NodeManagers
19/03/18 21:04:06 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
19/03/18 21:04:06 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
19/03/18 21:04:06 INFO Client: Setting up container launch context for our AM
19/03/18 21:04:06 INFO Client: Preparing resources for our AM container
19/03/18 21:04:07 INFO Client: Uploading resource file:/home/zmx/software/spark-1.3.0-bin-hadoop2.4/lib/spark-assembly-1.3.0-hadoop2.4.0.jar -> hdfs://master:9000/user/zmx/.sparkStaging/application_1552885048834_0014/spark-assembly-1.3.0-hadoop2.4.0.jar
19/03/18 21:04:10 INFO Client: Uploading resource file:/home/zmx/software/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar -> hdfs://master:9000/user/zmx/.sparkStaging/application_1552885048834_0014/spark-examples-1.3.0-hadoop2.4.0.jar
19/03/18 21:04:12 INFO Client: Setting up the launch environment for our AM container
19/03/18 21:04:12 INFO SecurityManager: Changing view acls to: zmx
19/03/18 21:04:12 INFO SecurityManager: Changing modify acls to: zmx
19/03/18 21:04:12 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zmx); users with modify permissions: Set(zmx)
19/03/18 21:04:12 INFO Client: Submitting application 14 to ResourceManager
19/03/18 21:04:12 INFO YarnClientImpl: Submitted application application_1552885048834_0014
19/03/18 21:04:13 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:13 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1552914252846
	 final status: UNDEFINED
	 tracking URL: http://master:8088/proxy/application_1552885048834_0014/
	 user: zmx
19/03/18 21:04:14 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:15 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:16 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:17 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:18 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:19 INFO Client: Application report for application_1552885048834_0014 (state: ACCEPTED)
19/03/18 21:04:20 INFO Client: Application report for application_1552885048834_0014 (state: RUNNING)
19/03/18 21:04:20 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: slave4
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1552914252846
	 final status: UNDEFINED
	 tracking URL: http://master:8088/proxy/application_1552885048834_0014/
	 user: zmx
... (the same RUNNING report repeats once per second until 19/03/18 21:05:48) ...
19/03/18 21:05:49 INFO Client: Application report for application_1552885048834_0014 (state: FINISHED)
19/03/18 21:05:49 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: slave4
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1552914252846
	 final status: SUCCEEDED
	 tracking URL: http://master:8088/proxy/application_1552885048834_0014/
	 user: zmx

Problems Encountered and Solutions

1. Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Cause: yarn-site.xml is misconfigured. Edit yarn-site.xml on all nodes and point the ResourceManager addresses at the master node:

<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>

2. Hadoop fails to start; the log shows

2014-03-20 21:00:20,545 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.lang.IllegalArgumentException: Does not contain a valid host:port authority:  master:8031 (configuration property 'yarn.resourcemanager.resource-tracker.address')
java.lang.IllegalArgumentException: Does not contain a valid host:port authority:  master:8031 (configuration property 'yarn.resourcemanager.resource-tracker.address')
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
	at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1590)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.serviceInit(ResourceTrackerService.java:106)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:288)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:871)

Cause: yarn-site.xml is misconfigured; there must be no whitespace inside the <value> tags (note the leading space in " master:8031" above).

 

3. Hadoop fails to start: INFO org.apache.hadoop.ipc.Client: Retrying connect to server: maste/192.168.122.54:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Problem: /etc/hosts contains an extra record:

127.0.1.1 master

$ netstat -apn|grep 8031
tcp 0 0 127.0.1.1:8031 0.0.0.0:* LISTEN 4964/java

Checking port 8031 shows that the process is only listening on 127.0.1.1:8031, so no host other than the machine itself can connect, which causes the error. Delete the record 127.0.1.1 xxx (where xxx is the machine's own hostname) from the hosts file on every node and restart the namenode.

Other ports showing the same symptom can be fixed the same way.
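The offending record can be removed non-interactively with sed (GNU sed's -i edits in place, as on ubuntu). The demo below works on a scratch copy so it is safe to try anywhere; on the real nodes, point the sed line at /etc/hosts (with sudo) instead:

```shell
# Build a hosts file containing the problematic record, then strip it
printf '127.0.0.1 localhost\n127.0.1.1 master\n192.168.122.54 master\n' > hosts.demo
sed -i '/^127\.0\.1\.1/d' hosts.demo
cat hosts.demo
```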

 

4. Problem when formatting HDFS

FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000. Exiting. 
java.io.IOException: Incompatible clusterIDs in /home/lxh/hadoop/hdfs/data: namenode clusterID = CID-a3938a0b-57b5-458d-841c-d096e2b7a71c; datanode clusterID = CID-200e6206-98b5-44b2-9e48-262871884eeb
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:477)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:226)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:254)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:974)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:945)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:278)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
	at java.lang.Thread.run(Thread.java:745)

Problem: the command bin/hdfs namenode -format must be run only once; running it again causes the error above.

Solution: if the problem appears without having formatted the namenode more than once, see https://blog.csdn.net/liuxinghao/article/details/40121843

If the format command was run multiple times, back up the hadoop configuration, delete the hadoop folder on all nodes, then reconfigure on master and redistribute.
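If the namenode has been formatted more than once, a common lighter-weight fix is to clear the stale storage directories before reformatting. A dry-run sketch, assuming the storage lives under the hadoop.tmp.dir configured in core-site.xml above; it only prints the per-node cleanup commands (remove the leading echo to run them):

```shell
# Dry run: print the stale-storage cleanup command for every node
for host in master slave1 slave2 slave3 slave4; do
  echo ssh "$host" rm -rf /home/zmx/software/hadoop-2.4.1/tmp
done
# Afterwards, reformat ONCE on master only:
#   bin/hdfs namenode -format
```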

 

5. Permission denied when starting Hadoop with sbin/start-dfs.sh

   Cause: insufficient permissions; run chmod first, then distribute the hadoop folder:

sudo chmod -R 777 ~/software/hadoop-2.4.1

  

References

  1. Setting up a Spark cluster on virtual machines from scratch
  2. Building a fully distributed Hadoop-2.4.1 environment
  3. Fully distributed installation of hadoop 2.6
  4. Disabling the firewall on ubuntu
  5. Hadoop slave node error: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s);
  6. One of the startup exceptions under hadoop
  7. Hadoop cluster installation and configuration in practice
  8. Problems encountered when formatting hadoop 2.x.x
  9. Datanode fails to start after re-formatting the namenode
  10. Testing Hadoop 2.4.1 with the wordcount example program
  11. Installing and deploying a Spark On YARN cluster
  12. Spark docs: Dynamic Resource Allocation
  13. An analysis of Spark Dynamic Allocation
  14. Introductory notes on Spark




http://www.chinasem.cn/article/495180
