Windows slaves fail to start via DCOM

If you choose "Let Jenkins control this Windows slave as a Windows service" to connect to a slave (see Windows Slaves Plugin), you may get an error message like this:

Access is denied. [0x00000005]
	at org.jinterop.dcom.core.JIComServer.init(JIComServer.java:542)
	at org.jinterop.dcom.core.JIComServer.initialise(JIComServer.java:458)
	at org.jinterop.dcom.core.JIComServer.<init>(JIComServer.java:427)
	at org.jvnet.hudson.wmi.WMI.connect(WMI.java:41)
	at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:107)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:170)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: rpc.FaultException: Received fault. (unknown)
	at rpc.ConnectionOrientedEndpoint.call(ConnectionOrientedEndpoint.java:142)
	at rpc.Stub.call(Stub.java:112)
	at org.jinterop.dcom.core.JIComServer.init(JIComServer.java:538)
	... 10 more

If so, check the following settings on the Windows machine, one after the other.
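
Before working through the sections by hand, a launch log can be triaged against the error codes covered on this page. The following Python sketch is illustrative only: the function name suggest_sections and the wording of the hints are assumptions made for this example, while the codes and the sections they point to are the ones described on this page.

```python
import re

# Error codes mentioned on this page, mapped to the section that discusses them.
# The hint texts are summaries written for this sketch, not Jenkins output.
KNOWN_CODES = {
    0x00000005: "Access is denied error: check LocalAccountTokenFilterPolicy, "
                "then work through the other sections",
    0x800703FA: "Slave under domain account: change the user profile group policy",
    0xC0000001: "Ports 139 and 445: check the firewall on the slave",
    0xC0000034: "Remote Registry Service: make sure the service is running",
    0x8001FFFF: "Windows returned error code 0x8001ffff: consider disabling NTLMv2",
    0x00000103: ".NET Framework: try upgrading to 3.5 SP1",
}

# Matches hexadecimal codes with exactly 8 digits, such as 0xC0000034.
_CODE_PATTERN = re.compile(r"0x([0-9A-Fa-f]{8})\b")


def suggest_sections(log_text):
    """Return a hint for each known error code in log_text, in order of first appearance."""
    hints = []
    seen = set()
    for match in _CODE_PATTERN.finditer(log_text):
        code = int(match.group(1), 16)
        if code in KNOWN_CODES and code not in seen:
            seen.add(code)
            hints.append(KNOWN_CODES[code])
    return hints
```

Passing the text of a failed launch log to suggest_sections returns the matching hints, or an empty list when none of the codes on this page appear in it.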

Windows account related issues

Local "Administrators" group membership

Make sure that the user name you have entered is a member of the local "Administrators" group. In the default Windows installation, this group membership is required for Jenkins to remotely copy files and install a service.

Slave under domain account

If your slave is running under a domain account and you get an error code 0x800703FA, change a group policy:

  • open the group policy editor (gpedit.msc)
  • go to Computer Configuration -> Administrative Templates -> System -> User Profiles and open "Do not forcefully unload the user registry at user logoff"
  • Change the setting from "Not Configured" to "Enabled", which disables the new User Profile Service feature ('DisableForceUnload' is the value added to the registry)

Credit to Oliver Walsh (see comments below)

Windows networking related issues

Firewall

By default, Windows Firewall prevents the TCP connections necessary to make this mechanism work. The firewall on the slave must allow the following exceptions (see List of TCP and UDP port numbers):

  • TCP Port 135 (DCE/RPC Locator service)
  • TCP Port 139 (NetBIOS Session Service)
  • TCP Port 445 (Windows shares)
  • C:\WINDOWS\system32\dllhost.exe (dllhost.exe seems to use a random port number)
  • C:\WINDOWS\system32\javaw.exe (Jenkins also uses a random port number)
  • File and Printer sharing (TCP 139, TCP 445, UDP 137, UDP 138 (possibly only a subset of these is required))

The easiest way to track down firewall issues is to use tcpdump. Just run the following command on the Jenkins server, which is trying to connect to the slave:

Linux/UNIX:

tcpdump -n -i <IF> -s 1500 port not 22 and host <HOST-IP>

<IF>       the network interface name, e.g. eth1
<HOST-IP>  the IP address of the slave

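As a complement to tcpdump, the fixed TCP ports from the list above can be probed directly from the Jenkins server. This is a minimal sketch, not part of Jenkins: the function names probe_tcp_port and probe_slave are made up for this example, and only the three port numbers come from this page.

```python
import socket

# Fixed TCP ports named in the firewall list on this page.
REQUIRED_TCP_PORTS = {
    135: "DCE/RPC Locator service",
    139: "NetBIOS Session Service",
    445: "Windows shares",
}


def probe_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_slave(host, ports=REQUIRED_TCP_PORTS, timeout=3.0):
    """Probe each port and return a dict mapping port number to reachability."""
    return {port: probe_tcp_port(host, port, timeout) for port in ports}
```

Calling probe_slave with the IP address of the slave returns one True or False per port. The random ports used by dllhost.exe and javaw.exe cannot be checked this way, so a clean result here does not rule out a firewall problem.
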
Ports 139 and 445

When ports 139 (NetBIOS Session Service) and 445 (Windows shares) are not reachable, the following error message appears:

ERROR: Message not found for errorCode: 0xC0000001
org.jinterop.dcom.common.JIException: Message not found for errorCode: 0xC0000001
 at org.jinterop.winreg.smb.JIWinRegStub.winreg_OpenHKCR(JIWinRegStub.java:121)
 at org.jinterop.dcom.core.JIComServer.initialise(JIComServer.java:479)
 at org.jinterop.dcom.core.JIComServer.<init>(JIComServer.java:427)
 at org.jvnet.hudson.wmi.WMI.connect(WMI.java:41)
 at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:137)
 at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:184)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
Caused by: jcifs.smb.SmbException:
Connection timeout jcifs.util.transport.TransportException: Connection timeout
 at jcifs.util.transport.Transport.connect(Transport.java:178)
 at jcifs.smb.SmbTransport.connect(SmbTransport.java:294)
 at jcifs.smb.SmbTree.treeConnect(SmbTree.java:141)
 at jcifs.smb.SmbFile.doConnect(SmbFile.java:858)
 at jcifs.smb.SmbFile.connect(SmbFile.java:901)
 at jcifs.smb.SmbFile.connect0(SmbFile.java:827)
 at jcifs.smb.SmbFileInputStream.<init>(SmbFileInputStream.java:76)
 at jcifs.smb.SmbFileInputStream.<init>(SmbFileInputStream.java:65)
 at jcifs.smb.SmbFile.getInputStream(SmbFile.java:2784)
 at rpc.ncacn_np.RpcTransport.attach(RpcTransport.java:90)
 at rpc.Stub.attach(Stub.java:105)
 at rpc.Stub.call(Stub.java:109)
 at org.jinterop.winreg.smb.JIWinRegStub.winreg_OpenHKCR(JIWinRegStub.java:119)
 at org.jinterop.dcom.core.JIComServer.initialise(JIComServer.java:479)
 at org.jinterop.dcom.core.JIComServer.<init>(JIComServer.java:427)
 at org.jvnet.hudson.wmi.WMI.connect(WMI.java:41)
 at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:137)
 at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:184)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 at jcifs.smb.SmbTransport.connect(SmbTransport.java:296)
 at jcifs.smb.SmbTree.treeConnect(SmbTree.java:141)
 at jcifs.smb.SmbFile.doConnect(SmbFile.java:858)
 at jcifs.smb.SmbFile.connect(SmbFile.java:901)
 at jcifs.smb.SmbFile.connect0(SmbFile.java:827)
 at jcifs.smb.SmbFileInputStream.<init>(SmbFileInputStream.java:76)
 at jcifs.smb.SmbFileInputStream.<init>(SmbFileInputStream.java:65)
 at jcifs.smb.SmbFile.getInputStream(SmbFile.java:2784)
 at rpc.ncacn_np.RpcTransport.attach(RpcTransport.java:90)
 at rpc.Stub.attach(Stub.java:105)
 at rpc.Stub.call(Stub.java:109)
 at org.jinterop.winreg.smb.JIWinRegStub.winreg_OpenHKCR(JIWinRegStub.java:119)
 ... 10 more

Windows NAT blocking

This can occur when the Jenkins server and a slave running a newer version of Windows (e.g. 2008) are on different network segments (it has been observed on EC2). You get an error saying port 135 is unavailable even though you have opened it. To fix this, open the firewall rule, go to the Advanced tab and select "Allow Edge Traversal". Remember that NAT is not a security mechanism.

Windows registry related issues

Remote Communication Service

The Remote Communication Service "RemComSvc" must be running in order to launch commands remotely. If it is not started, Jenkins tries to start it remotely, assuming it is configured correctly. If that fails, you may get an error like this:

Checking if Java exists
ERROR: Failed to prepare Java
java.lang.reflect.UndeclaredThrowableException

In some cases (especially on Windows 2008 R2) this can be caused by the lack of the Visual C++ runtime libraries needed by the service. If this is the case, you will see an error in the Windows event log similar to:

Activation context generation failed for "C:\Windows\RemComSvc.exe".
Dependent Assembly Microsoft.VC90.CRT,processorArchitecture="x86",publicKeyToken="1fc8b3b9a1e18e3b",type="win32",version="9.0.21022.8" could not be found.
Please use sxstrace.exe for detailed diagnosis.

To solve this issue, install the Visual C++ 2008 x86 runtime libraries.

Remote Registry Service

The Remote Registry service must be running in order to install the Jenkins service, but it may be stopped on your computer.  This is especially true for Windows Vista, where it is disabled by default.  If it is not running, you may get an error like this:

Message not found for errorCode: 0xC0000034
 org.jinterop.dcom.common.JIException: Message not found for errorCode: 0xC0000034
     at org.jinterop.winreg.smb.JIWinRegStub.winreg_OpenHKCR(JIWinRegStub.java:121)
     at org.jinterop.dcom.core.JIComServer.initialise(JIComServer.java:479)
     at org.jinterop.dcom.core.JIComServer.<init>(JIComServer.java:427)
     at org.jvnet.hudson.wmi.WMI.connect(WMI.java:41)
     at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:107)
     at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:178)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
     at java.lang.Thread.run(Thread.java:636)
 Caused by: jcifs.smb.SmbException: The system cannot find the file specified.
     at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:542)
     at jcifs.smb.SmbTransport.send(SmbTransport.java:644)
     at jcifs.smb.SmbSession.send(SmbSession.java:242)
     at jcifs.smb.SmbTree.send(SmbTree.java:111)
     at jcifs.smb.SmbFile.send(SmbFile.java:729)
     at jcifs.smb.SmbFile.open0(SmbFile.java:934)
     at jcifs.smb.SmbFile.open(SmbFile.java:951)
     at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:142)
     at jcifs.smb.TransactNamedPipeOutputStream.<init>(TransactNamedPipeOutputStream.java:32)
     at jcifs.smb.SmbNamedPipe.getNamedPipeOutputStream(SmbNamedPipe.java:187)
     at rpc.ncacn_np.RpcTransport.attach(RpcTransport.java:91)
     at rpc.Stub.attach(Stub.java:105)
     at rpc.Stub.call(Stub.java:109)
     at org.jinterop.winreg.smb.JIWinRegStub.winreg_OpenHKCR(JIWinRegStub.java:119)
     ... 10 more

If so, start the control panel, open "Administrative Tools," then "Services." Locate the Remote Registry service on the list, and click "Start this service."
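
The same check can be scripted with the sc command-line tool that ships with Windows. The sketch below only builds the command lines and is an illustration under assumptions: the helper name remote_registry_commands is hypothetical, and it relies on the service's short name being RemoteRegistry.

```python
import subprocess


def remote_registry_commands(host=None):
    """Build sc command lines that query and then start the Remote Registry service.

    With host=None the commands target the local machine; otherwise they
    target the given machine using sc's \\\\host syntax.
    """
    prefix = ["sc"] if host is None else ["sc", "\\\\" + host]
    return [
        prefix + ["query", "RemoteRegistry"],
        prefix + ["start", "RemoteRegistry"],
    ]


def run_commands(commands):
    """Run each command in turn and return the exit codes (only meaningful on Windows)."""
    return [subprocess.run(cmd).returncode for cmd in commands]
```

If the service is disabled rather than merely stopped, as described above for Windows Vista, the start command is expected to fail until the startup type has been changed in the Services panel.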

Enable Remote Registry Access on Windows 7

By default, Windows 7 (at least) still denies remote access to the registry, even if the Remote Registry service is started. To test this, try to connect to your slave's registry via regedit on another machine. If you get a similar error ("Access is denied"), run PowerShell as an administrator on the slave and execute Enable-PSRemoting. Reboot for good measure, and try launching the slave again.

Windows security related issues

Local Security Settings

  1. Start the control panel, go to "Administrative Tools", then "Local Security Policy". This opens the "Local Security Settings" window.
  2. Go to "Local Policies" > "Security Options" > "Network access: Sharing and security model for local accounts." Change that to "Classic."
    This only applies to Windows computers that are not a part of a domain (reference)

WBEM Scripting Locator

On current Windows systems, Jenkins requires access to the "WBEM Scripting Locator". The following steps allow that:

  1. Launch 'regedit' (as Administrator)
  2. Find (Ctrl+F) the following registry key: "{76A64158-CB41-11D1-8B02-00600806D9B6}" (it's in HKEY_CLASSES_ROOT\CLSID)
  3. Right click and select 'Permissions'
  4. Change owner to administrators group (Advanced...).
  5. Change permissions for administrators group. Grant Full Control.
  6. Change owner back to TrustedInstaller (user is "NT Service\TrustedInstaller" on local machine)
  7. Restart Remote Registry Service (Administrative Tools / Services)

Credit to Florian Vogle on the Hudson wiki.

Access is denied error

When you get an "Access is denied. [0x00000005]" error, apply the following change to the registry:

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
  • create or modify 32-bit DWORD: LocalAccountTokenFilterPolicy
  • set the value to: 1
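
When the change has to be applied to several slaves, it can be captured as a .reg file and imported on each machine. The following Python sketch generates such a file; the helper name dword_reg_file is an assumption of this example, and only the key, value name and value come from the steps above.

```python
def dword_reg_file(key_path, value_name, value):
    """Return the text of a .reg file that sets one 32-bit DWORD value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + key_path + "]",
        # DWORD values are written as eight hexadecimal digits.
        '"{}"=dword:{:08x}'.format(value_name, value),
        "",
    ]
    return "\r\n".join(lines)


POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"
TOKEN_FILTER_REG = dword_reg_file(POLICY_KEY, "LocalAccountTokenFilterPolicy", 1)
```

Writing TOKEN_FILTER_REG to a file with a .reg extension and importing it as an administrator should have the same effect as the manual steps. Review the generated text before importing it, since registry changes take effect immediately.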

Credit to Arturas Sirvinskas (comments below)

Remote Agent - Windows returned error code 0x8001ffff

On Windows 2008 R2 (64-bit), if you see a message like this:

ERROR: Message not found for errorCode: 0x8001FFFF
org.jinterop.dcom.common.JIException: Message not found for errorCode: 0x8001FFFF
 at org.jinterop.dcom.core.JIComServer.init(JIComServer.java:546)
 at org.jinterop.dcom.core.JIComServer.initialise(JIComServer.java:458)
 at org.jinterop.dcom.core.JIComServer.<init>(JIComServer.java:427)
 at org.jvnet.hudson.wmi.WMI.connect(WMI.java:59)
 at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:218)
 at org.jenkinsci.plugins.vSphereCloudLauncher.launch(vSphereCloudLauncher.java:198)
 at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:204)
 at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
 at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketTimeoutException
 at sun.nio.ch.SocketAdaptor$SocketInputStream.read(Unknown Source)
 at sun.nio.ch.ChannelInputStream.read(Unknown Source)
 at org.jinterop.dcom.transport.JIComTransport.receive(JIComTransport.java:146)
 at rpc.DefaultConnection.receiveFragment(DefaultConnection.java:182)
 at rpc.DefaultConnection.receive(DefaultConnection.java:68)
 at rpc.ConnectionOrientedEndpoint.receive(ConnectionOrientedEndpoint.java:227)
 at rpc.ConnectionOrientedEndpoint.bind(ConnectionOrientedEndpoint.java:181)
 at rpc.ConnectionOrientedEndpoint.rebind(ConnectionOrientedEndpoint.java:153)
 at org.jinterop.dcom.transport.JIComEndpoint.rebindEndPoint(JIComEndpoint.java:40)
 at org.jinterop.dcom.core.JIComServer.init(JIComServer.java:535)
 ... 11 more

To resolve these issues, you may need to disable NTLMv2 authentication.
To turn off NTLMv2 authentication:

  1. Run regedit to edit the registry.
  2. Locate the following registry key: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Lsa.
  3. Locate the value named LMCompatibilityLevel, and change the DWORD value to 2 (send NTLM authentication only).
  4. Close regedit and restart the machine.

Taken from https://support.quest.com/SolutionDetail.aspx?id=SOL86281

Windows installation related issues

Configuration of the slave if the Jenkins master server changed address

Check the logs of the slave service as it starts.
If the configuration of the Jenkins master has changed, the slave may still be trying to connect to the master's old address.
To fix this:

  1. on the slave: stop the Jenkins Windows service (if it is not already dead)
  2. on the master: go to Jenkins > Manage Jenkins > Configure System, and copy the value of the 'Jenkins URL' parameter
  3. on the slave: edit jenkins-slave.xml, alter the service/arguments section so that it reflects the new URL of the server copied in the previous step, and save the file
  4. on the slave: start the Jenkins service and check the logs to see whether anything else fails.
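
Step 3 can be scripted when many slaves need the same change. This is a minimal sketch that assumes jenkins-slave.xml has a service root element with an arguments child containing the master's URL; the function name update_master_url and the sample configuration are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical example of a jenkins-slave.xml file; real files contain more elements.
SAMPLE_CONFIG = """<service>
  <id>jenkinsslave</id>
  <executable>java</executable>
  <arguments>-jar slave.jar -jnlpUrl http://old-master:8080/computer/win1/slave-agent.jnlp</arguments>
</service>"""


def update_master_url(xml_text, old_url, new_url):
    """Return xml_text with old_url replaced by new_url inside service/arguments."""
    root = ET.fromstring(xml_text)
    arguments = root.find("arguments")  # direct child of the <service> root
    if root.tag != "service" or arguments is None or arguments.text is None:
        raise ValueError("no service/arguments element found")
    if old_url not in arguments.text:
        raise ValueError("old URL not present in arguments")
    arguments.text = arguments.text.replace(old_url, new_url)
    return ET.tostring(root, encoding="unicode")
```

For example, update_master_url(SAMPLE_CONFIG, "http://old-master:8080/", "http://new-master:8080/") returns the configuration with the new address. ElementTree drops XML comments when parsing, so keep a backup of the original file before overwriting it.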

.NET Framework

On Windows XP / 2003, if you see a message like this:

Installing the Hudson slave service
No more data is available. [0x00000103]
org.jinterop.dcom.common.JIException: No more data is available. [0x00000103]
	at org.jinterop.winreg.smb.JIWinRegStub.winreg_EnumKey(JIWinRegStub.java:390)
	at hudson.util.jna.DotNet.isInstalled(DotNet.java:81)
	at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:117)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:180)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:636)
Caused by: org.jinterop.dcom.common.JIRuntimeException: No more data is available. [0x00000103]
	at org.jinterop.winreg.IJIWinReg$enumKey.read(IJIWinReg.java:762)
	at ndr.NdrObject.decode(NdrObject.java:19)
	at rpc.ConnectionOrientedEndpoint.call(ConnectionOrientedEndpoint.java:138)
	at rpc.Stub.call(Stub.java:112)
	at org.jinterop.winreg.smb.JIWinRegStub.winreg_EnumKey(JIWinRegStub.java:386)
	... 8 more

Then try upgrading the .NET Framework to version 3.5 SP1.

Taken from http://n4.nabble.com/exception-when-winxp-slaves-launch-No-more-data-is-available-0x00000103-td386006.html

Windows 64bit installation related issues

See page comments below for various tips on using Windows 64bit slave (Windows 7 or Server 2008).

Make sure the Java bin directory is in your system PATH, e.g. \Program Files (x86)\Java\jre6\bin or \Windows\SYSWOW64

WARNING: You must have the path to the JRE that is installed in \Windows\SYSWOW64.  For example, when my system updated to Java 7, I had to update the PATH to point at the new JRE; otherwise starting the slave failed silently.
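
Because the failure is silent, it can help to check which java executable a given PATH resolves to before launching the slave. A small sketch using Python's standard shutil.which; the wrapper name find_java is made up for this example.

```python
import shutil


def find_java(path_value=None):
    """Return the full path of the java executable found on path_value, or None.

    With path_value=None the PATH of the current process is searched.
    """
    return shutil.which("java", path=path_value)
```

Running find_java() on the slave under the account that runs the service shows whether PATH still points at a JRE that exists. A result of None means no java executable was found on that PATH.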

If this bugs you, then upvote https://issues.jenkins-ci.org/browse/JENKINS-16061 and https://issues.jenkins-ci.org/browse/JENKINS-14559

Windows 2008 R2 (64bit)

This is an attempt to describe what I had to do on a clean Windows 2008 R2 (64bit) install to get it to work:

  1. Turned off the firewall (this could be configured correctly to be safer, but I didn't care since it's in a firewalled "safe" part of the net)
  2. Installed the Visual C++ Redist
  3. Changed the permissions on the registry key owned by TrustedInstaller (see "WBEM Scripting Locator" above).
  4. Added the Java "/bin" directory to "PATH"

Windows Server 2012 (64bit)

To connect to Windows Server 2012, change the permissions for the following registry keys to Full Control:

- HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Wow6432Node\CLSID\{72C24DD5-D70A-438B-8A42-98424B88AFB8}

- HKEY_CLASSES_ROOT\CLSID\{76A64158-CB41-11D1-8B02-00600806D9B6}

  1. Launch 'regedit' (as Administrator)
  2. Find (Ctrl+F) the following registry key: "{72C24DD5-D70A-438B-8A42-98424B88AFB8}" in HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Wow6432Node\CLSID\
  3. Right click and select 'Permissions'
  4. Change owner to administrators group (Advanced...).
  5. Change permissions for administrators group. Grant Full Control.
  6. Change owner back to TrustedInstaller (user is "NT Service\TrustedInstaller" on local machine)
  7. Repeat steps 1-6 for HKEY_CLASSES_ROOT\CLSID\{76A64158-CB41-11D1-8B02-00600806D9B6}
  8. Restart Remote Registry Service (Administrative Tools / Services)

If all else fails...

If you have KB2661256 installed, you can refer to this issue https://issues.jenkins-ci.org/browse/JENKINS-15596
Please file an issue that includes the stack trace and details such as the Windows version, so that we can take a look.

What is it actually trying to do?

This section goes into the details of how the managed Windows slave launcher actually works.

This launcher uses several protocols that have been around for quite some time.

  • It first uses CIFS (also known as the "Windows file share protocol") to push files to the slave. When used by someone with administrative privileges, Windows file shares expose what are commonly known as "administrative shares", which are hidden exported directories that cover every drive in the system.
  • It then uses DCOM to talk to WMI to install and start a service remotely.
  • Jenkins uses two services. The first, the Remote Communication Service, provides a general-purpose remote command execution capability. Jenkins uses it to check whether Java is available and, if not, to install it. A failure here is not fatal, as Jenkins proceeds on the assumption that Java is available in a reasonable place. This service is removed after use so that it poses no security risk. The communication between the Jenkins master and this service happens over a named pipe, which is itself protected by access control.
  • Jenkins then installs the actual slave as a Windows service using WMI over DCOM, and starts that service.
