Building Hadoop 3.x on Windows 10
https://www.modb.pro/db/133161
https://www.cnblogs.com/jhxxb/p/10765815.html
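# Why build Hadoop from source?
Hadoop is written mostly in Java, but some requirements and features are a poor fit for Java, so those parts are written in C++. This is where the Native Libraries come in: Java calls the C++ code through JNI, loading it from prebuilt native library files. Native libraries are platform-specific (.so files on Linux, .dll files on Windows), so to run on a different operating system the Hadoop source has to be recompiled for it.
Of the build output, the two files that matter most are hadoop.dll and winutils.exe. Without them Hadoop fails at run time with errors like the following.
Missing winutils.exe: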
java.io.FileNotFoundException: Could not locate Hadoop executable: F:\Program Files\hadoop-3.3.0\bin\winutils.exe -see https://wiki.apache.org/hadoop/WindowsProblems
Missing hadoop.dll:
WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2023-06-13 22:46:08,580 WARN [org.apache.hadoop.metrics2.impl.MetricsConfig] - Cannot locate configuration: tried hadoop-metrics2-jobtracker.properties,hadoop-metrics2.properties
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
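Before digging into a stack trace like the ones above, it can help to confirm whether the two files are actually in place. The following is a minimal sketch, not part of Hadoop: the function names are made up for illustration, and it assumes both files are expected in the `bin` directory of the Hadoop installation, as the winutils.exe error path suggests.

```python
from pathlib import Path

# The two native artifacts this article singles out as required on Windows.
REQUIRED_NATIVE_FILES = ("winutils.exe", "hadoop.dll")


def missing_native_files(hadoop_home):
    """Return the required files that are absent from <hadoop_home>/bin."""
    bin_dir = Path(hadoop_home) / "bin"
    return [name for name in REQUIRED_NATIVE_FILES
            if not (bin_dir / name).is_file()]


def describe(hadoop_home):
    """Build a one-line report for the given installation directory."""
    missing = missing_native_files(hadoop_home)
    if missing:
        return "missing from bin: " + ", ".join(missing)
    return "winutils.exe and hadoop.dll are both present"
```

For the installation in the error message above, the call would be `print(describe(r"F:\Program Files\hadoop-3.3.0"))`.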
# Next, let's build the Hadoop source
Download the source
Source download: https://hadoop.apache.org/releases.html
Other versions: https://archive.apache.org/dist/hadoop/core/
This walkthrough uses 3.3.0. The build instructions are in BUILDING.txt in the source tree; the Windows section reads:
----------------------------------------------------------------------------------
Building on Windows
----------------------------------------------------------------------------------
Requirements:
* Windows System
* JDK 1.8
* Maven 3.0 or later
* Protocol Buffers 3.7.1
* CMake 3.1 or newer
* Visual Studio 2010 Professional or Higher
* Windows SDK 8.1 (if building CPU rate control for the container executor)
* zlib headers (if building native code bindings for zlib)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
* Unix command-line tools from GnuWin32: sh, mkdir, rm, cp, tar, gzip. These
tools must be present on your PATH.
* Python ( for generation of docs using 'mvn site')
Unix command-line tools are also included with the Windows Git package which
can be downloaded from http://git-scm.com/downloads
If using Visual Studio, it must be Professional level or higher.
Do not use Visual Studio Express. It does not support compiling for 64-bit,
which is problematic if running a 64-bit system.
The Windows SDK 8.1 is available to download at:
http://msdn.microsoft.com/en-us/windows/bg162891.aspx
Cygwin is not required.
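Several of the requirements above come down to "this command must be on PATH". A rough way to check that in one go is sketched below. The executable names `java`, `mvn`, `protoc` and `cmake` are assumed to be the usual ones; the Unix tool names are taken from the list above, and the function name is only illustrative.

```python
import shutil

# Unix tool names come from the BUILDING.txt excerpt; "java", "mvn",
# "protoc" and "cmake" are assumed to be the usual executable names.
REQUIRED_TOOLS = ["java", "mvn", "protoc", "cmake",
                  "sh", "mkdir", "rm", "cp", "tar", "gzip"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be resolved on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

`shutil.which` only finds real executables, so a hit for `mkdir` or `rm` means the GnuWin32 or Git versions are on PATH, not the cmd built-ins.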
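# Environment setup
# 1.JDK
# Set system environment variables
setx /M JAVA_HOME "D:\hadoop\jdk1.8.0_192"
# setx only takes effect in newly opened cmd windows, so %JAVA_HOME% is not defined yet here; use the literal path
setx /M Path "%Path%;D:\hadoop\jdk1.8.0_192\bin;D:\hadoop\jdk1.8.0_192\jre\bin"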
Verify
Verify in a new cmd window:
java -version
Result:
java version "1.8.0_261"
Java(TM) SE Runtime Environment (build 1.8.0_261-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.261-b12, mixed mode)
# 2.Maven
http://maven.apache.org/download.cgi
https://archive.apache.org/dist/maven/maven-3/
# Set system environment variables
setx /M M2_HOME "D:\hadoop\apache-maven-3.6.1"
# %M2_HOME% is not defined in the current window yet, so use the literal path
setx /M Path "%Path%;D:\hadoop\apache-maven-3.6.1\bin"
Repository configuration in conf\settings.xml
<!-- Local repository path -->
<localRepository>D:\hadoop\repo</localRepository>
<!-- Remote repository mirrors -->
<mirrors>
  <mirror>
    <id>central</id>
    <mirrorOf>central</mirrorOf>
    <name>aliyunmaven</name>
    <url>https://maven.aliyun.com/repository/central</url>
  </mirror>
  <mirror>
    <id>apache.snapshots.https</id>
    <mirrorOf>apache.snapshots.https</mirrorOf>
    <name>aliyunmaven</name>
    <url>https://maven.aliyun.com/repository/apache-snapshots</url>
  </mirror>
</mirrors>
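A typo in settings.xml tends to surface only as a slow or failed download much later, so it can be worth reading the mirrors back before building. Here is a small sketch that lists them with Python's standard XML parser. The function name is hypothetical, and it assumes the fragment above sits inside the usual `<settings>` root element, with or without the Maven namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def list_mirrors(settings_xml_text):
    """Return (id, mirrorOf, url) for every <mirror> in a settings.xml string."""
    root = ET.fromstring(settings_xml_text)
    mirrors = []
    for elem in root.iter():
        # Skip comments and processing instructions, whose tag is not a string.
        if not isinstance(elem.tag, str) or local_name(elem.tag) != "mirror":
            continue
        fields = {local_name(child.tag): (child.text or "").strip()
                  for child in elem if isinstance(child.tag, str)}
        mirrors.append((fields.get("id"), fields.get("mirrorOf"), fields.get("url")))
    return mirrors
```

To use it, read the file first, for example `list_mirrors(open(r"D:\hadoop\apache-maven-3.6.1\conf\settings.xml", encoding="utf-8").read())`.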
Verify in cmd:
mvn -version
# 3.ProtocolBuffer
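https://github.com/protocolbuffers/protobuf/releases/tag/v3.23.1
Note: besides the protobuf source, you also need the prebuilt protoc command for Windows in the matching version (protoc-2.5.0-win32.zip). It converts .proto files into Java or C++ source files. The 2.5.0 file names in this step match what Hadoop 2.x required; the BUILDING.txt excerpt for 3.3.0 above asks for Protocol Buffers 3.7.1, so substitute the version your source tree requires.
Extract both archives, then copy protoc.exe into the protobuf-2.5.0\src directory.
# Install ProtocolBuffer
cd D:\hadoop\protobuf-2.5.0\java
mvn test
mvn install
# Set system environment variables
setx /M Path "%Path%;D:\hadoop\protobuf-2.5.0\src"
Verify in cmd: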
protoc --version
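Because BUILDING.txt pins an exact Protocol Buffers version, checking the reported version is more useful than checking that protoc merely runs. The sketch below parses the `libprotoc X.Y.Z` line that `protoc --version` prints. The helper names are illustrative, and the required version is the 3.7.1 from the BUILDING.txt excerpt for Hadoop 3.3.0.

```python
import re

# Version required by the BUILDING.txt excerpt for Hadoop 3.3.0.
REQUIRED_PROTOC = "3.7.1"


def parse_protoc_version(output):
    """Extract the version number from `protoc --version` output, or None."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def protoc_matches(output, required=REQUIRED_PROTOC):
    """True when the reported protoc version is exactly the required one."""
    return parse_protoc_version(output) == required
```

The `output` argument can come from `subprocess.run(["protoc", "--version"], capture_output=True, text=True).stdout`.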
Problem encountered:
https://github.com/protocolbuffers/protobuf/issues/7313
# 4.CMake
https://cmake.org/download/
# Set system environment variables
setx /M Path "%Path%;D:\hadoop\cmake-3.14.3-win64-x64\bin"
Verify in cmd:
cmake --version
Result:
cmake version 3.19.4
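BUILDING.txt asks for "CMake 3.1 or newer", which is a minimum rather than an exact version. Comparing version strings as text gets this wrong ("3.19" sorts before "3.2"), so the sketch below compares tuples of integers instead. The function names are made up for this example.

```python
import re


def parse_version(text):
    """Return the first dotted version in `text` as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum):
    """True when the version found in `text` is at least `minimum` (a tuple)."""
    version = parse_version(text)
    return version is not None and version >= minimum
```

With the output shown above, `meets_minimum("cmake version 3.19.4", (3, 1))` is true, because `(3, 19, 4)` compares greater than `(3, 1)` element by element.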
# 5.Visual Studio 2010 Professional
File name: cn_visual_studio_2010_professional_x86_dvd_532145.iso
SHA1: 33D323446131AB9565082D65C9C380BBD7FF228F
File size: 2.41GB
Release date: 2010-05-26
ed2k://|file|cn_visual_studio_2010_professional_x86_dvd_532145.iso|2591844352|6001253431AFE573E4344F5A0B1D9CAC|/
# 6. Install GetGnuWin32 and configure the environment variables
# 7.Zlib
http://www.zlib.net/
Build with MSVC: open the Start menu, find "Visual Studio x64 Win64 Command Prompt (2010)" under Microsoft Visual Studio 2010, and run it as administrator.
cd D:\hadoop\zlib-1.2.11
nmake -f win32/Makefile.msc
# When the build finishes, zlib1.dll appears in this directory
Set system environment variables
setx /M ZLIB_HOME "D:\hadoop\zlib-1.2.11"
# 8.Git (needed for the bash command)
https://git-scm.com/download/
# Set system environment variables
setx /M Path "%Path%;D:\hadoop\PortableGit\bin"
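# Build
1. Set Platform
setx /M Platform "x64"
2. Extract the source and start the build
Open the Start menu, find "Visual Studio x64 Win64 Command Prompt (2010)" under Microsoft Visual Studio 2010, and run it as administrator.
On a Visual Studio 2022 installation, the command prompt shortcuts are under:
C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Visual Studio 2022\Visual Studio Tools\VC
cd D:\hadoop\hadoop-3.3.0-src
mvn package -Pdist,native-win -DskipTests -Dtar
# When the build finishes, the output is in hadoop-3.3.0-src\hadoop-dist\target\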
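If the build fails, rerun it a few times, or switch to a different Maven mirror and try again; some dependency downloads are unreliable. The log of a successful build follows.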
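Rerunning by hand gets tedious when a full build takes close to an hour. One way to automate the "try again a few times" advice is a small retry wrapper like the sketch below. The function name, the attempt count and the delay are arbitrary choices for illustration, not something Maven or Hadoop provides.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=30, runner=subprocess.call):
    """Run `command` until it exits with 0.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed. `runner` takes the command and returns an exit code.
    """
    for attempt in range(1, attempts + 1):
        if runner(command) == 0:
            return attempt
        if attempt < attempts:
            # Give a flaky mirror a moment before the next attempt.
            time.sleep(delay_seconds)
    return None
```

For the build above, the command list would be `[mvn_path, "package", "-Pdist,native-win", "-DskipTests", "-Dtar"]`, run from the source directory. On Windows `mvn` is typically a `.cmd` script, so it is safer to resolve `mvn_path` with `shutil.which("mvn")` than to pass the bare name. Maven keeps already downloaded dependencies in the local repository, so later attempts only fetch what is still missing.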
[INFO] Reactor Summary for Apache Hadoop Main 3.3.0:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [ 39.854 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 35.664 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 13.814 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 13.440 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.205 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 28.307 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 26.019 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [07:47 min]
[INFO] Apache Hadoop Auth ................................. SUCCESS [03:03 min]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 4.259 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [10:30 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 3.421 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [01:23 min]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.064 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:11 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [01:45 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 3.279 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 29.455 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [ 23.094 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 2.839 s]
[INFO] Apache Hadoop HDFS-RBF ............................. SUCCESS [ 12.846 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.044 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [ 0.044 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 8.375 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [04:12 min]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 3.536 s]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [ 0.050 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 32.678 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 10.878 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [ 2.167 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [01:36 min]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [ 14.856 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 21.482 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [ 0.589 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 3.674 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 2.209 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 1.911 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [ 3.163 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [02:35 min]
[INFO] Apache Hadoop YARN Timeline Service HBase tests .... SUCCESS [01:36 min]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [ 0.045 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 1.792 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [ 1.213 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [ 0.042 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [ 0.045 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 5.462 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [ 0.134 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 23.516 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 13.747 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 2.323 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 6.160 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 4.593 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 2.615 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [ 1.280 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 4.254 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 2.127 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 13.367 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 5.503 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 1.260 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [ 1.315 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 3.355 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 2.324 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 1.477 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 1.219 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 1.615 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 0.040 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 2.513 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [06:23 min]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 46.260 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [01:14 min]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 5.923 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 0.596 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 4.338 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [01:43 min]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 30.751 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 13.557 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.047 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 47.214 s]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [ 0.554 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [ 0.049 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 55:18 min
[INFO] Finished at: 2019-04-25T22:47:26+08:00
[INFO] ------------------------------------------------------------------------
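A reactor summary this long is easier to read with a little tooling, for example to see which modules dominate the build time or which one failed. The sketch below parses summary lines in the format shown above. The function names are hypothetical, and the regular expression is based only on the lines in this log, so other Maven versions may need adjustments.

```python
import re

# Matches lines such as:
# [INFO] Apache Hadoop MiniKDC .............................. SUCCESS [07:47 min]
SUMMARY_LINE = re.compile(
    r"^\[INFO\] (?P<name>.+?) \.{2,} (?P<status>[A-Z]+) "
    r"\[\s*(?P<value>[\d:.]+) (?P<unit>s|min|h)\]\s*$"
)


def to_seconds(value, unit):
    """Convert '39.854' s, '07:47' min (mm:ss) or '01:02' h (hh:mm) to seconds."""
    if unit == "s":
        return float(value)
    major, minor = (int(part) for part in value.split(":"))
    if unit == "min":
        return float(major * 60 + minor)
    return float(major * 3600 + minor * 60)


def parse_reactor_summary(log_text):
    """Return (module name, status, seconds) for each reactor summary line."""
    rows = []
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            rows.append((match["name"], match["status"],
                         to_seconds(match["value"], match["unit"])))
    return rows


def slowest(rows, count=3):
    """Return the `count` modules with the longest build time."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:count]
```

Filtering the rows for a status other than `SUCCESS` gives the module to look at first after a failed build.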
Building the Hadoop 2.6.0 source on Windows 10 (作业部落 Cmd Markdown)
https://www.cnblogs.com/guoxiaoqian/p/4328812.html