After using facts to designate classes and data for nodes for a number of years, I recently realised this could actually be a major security issue. To understand the problem we must first cover a few things about how facter works.
Please see update
1) Facts are sent to a puppet master at the beginning of a puppet run from the puppet agent. After this they are available to the dsl at top-level scope.
So in a manifest I can use $::hostname, which is the hostname fact sent by the agent.
2) The puppet agent sends the facter variables to the master and we have no way of actually validating these variables. This means that the agent can send any facter variables.
If we trust the facts sent from an agent and use them to assign data or classes to a node, we can have security issues. For example, suppose we have a hostname of 1.app.dev.dc1.wibble.com and create custom facts that extract useful data from it and provide it to the puppet DSL as top-level variables: one for the machine type (app), one for the environment (dev) and one for the datacentre (dc1).
Each of these variables is the value of a custom fact. They can then be used within the puppet manifests, either in a hiera hierarchy or in logic that includes classes, as in the following setup.
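As a rough sketch of what such custom facts compute, here is the hostname split in plain Ruby. The helper name and the keys role, env and dc are my own assumptions for illustration; in a real custom fact each value would be returned from a Facter.add block with setcode on the agent.

```ruby
# Hypothetical helper: derive classification values from a hostname
# shaped like <index>.<role>.<env>.<dc>.<domain>.
# The key names are assumptions, not the names used on a real system.
def facts_from_hostname(hostname)
  _index, role, env, dc = hostname.split('.')
  { 'role' => role, 'env' => env, 'dc' => dc }
end

facts = facts_from_hostname('1.app.dev.dc1.wibble.com')
puts facts.inspect
```

Note that this code runs on the agent, so whoever controls the agent controls its output.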
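Hiera is set up with a hierarchy built from these variables, so a lookup consults one data file per value. site.pp then asks hiera for the list of classes and includes them on the node.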
This will include any classes that the classes lookup finds in any of the data sources hiera uses. In this case it will include any classes it finds in dev.yml, dc1.yml and app.yml. This is exactly the behavior that we want and expect.
What happens if an attacker compromises the above machine, which is connected to a puppet master that uses the above system to designate classes to a node? They can then run the agent with forged facts, claiming to be a db machine rather than an app machine. The top-level variables now describe a db machine in dev at dc1.
Hiera then uses the variables to include classes as before, but this time it includes classes from dev.yml, dc1.yml and db.yml. Puppet then starts applying the new classes in db.yml to the machine. These classes could include the root password for the dev database as well as other sensitive information. This would allow the attacker to find out more about our infrastructure and the configuration of machines on our network.
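To make the effect concrete, here is a small Ruby simulation of the lookup. It is not hiera itself: the in-memory hash stands in for the YAML files, the class names are invented, and the variable names role, env and dc are assumptions.

```ruby
# Collect classes from every level of the hierarchy, roughly like an
# array-merge lookup across env, dc and role data files.
def classes_for(data, vars)
  %w[env dc role].flat_map { |key| data.fetch(vars[key], []) }.uniq
end

# Stand-in for dev.yml, dc1.yml, app.yml and db.yml (invented class names).
hiera_data = {
  'dev' => ['base::dev'],
  'dc1' => ['ntp::dc1'],
  'app' => ['webapp'],
  'db'  => ['database', 'database::root_password']
}

honest = { 'env' => 'dev', 'dc' => 'dc1', 'role' => 'app' }
forged = honest.merge('role' => 'db') # one changed fact on the agent

puts classes_for(hiera_data, honest).inspect
puts classes_for(hiera_data, forged).inspect
```

Nothing on the master side changed between the two calls. The only difference is a value the agent supplied.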
As you can see, by trusting the facts that are sent to the puppet master from the puppet agent we are placing the actual classification logic on the puppet agent and not the puppet master. This means that any machine in our infrastructure can classify itself as any other type of machine in our infrastructure.
This is not only a problem of classes being included on nodes; data can also end up on the wrong node. For example, if we changed our env to prod rather than dev, hiera would look up all the production data and apply it to our node instead of the dev data.
In order to protect against this we must build manifests that simply do not trust any information sent from the client other than what we can actually confirm. The only piece of information we can actually confirm is the certname, because if a client changes its certname the node must go through the SSL signing process again.
In the example above I would not use custom facts but rather a custom function to ‘chomp’ the certname and set top-level variables. As we are using the certname and the logic from our custom function which is run on the master we can be assured that the variables being set can’t be tampered with on the client side.
A classify function like this would simply use a regex to extract the information needed from the certname and return it to the top-level variables. As these top-level variables override anything set by facter, we can then simply use them as before, but be sure that the agent can’t affect what classes and data are assigned to it by the puppet master.
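Here is a minimal sketch of the regex logic such a function might contain, written as plain Ruby rather than against the Puppet function API. The pattern, the key names and the choice to raise on a certname that doesn't match are all my assumptions.

```ruby
# Hypothetical master-side classification: parse the certname, which the
# agent cannot change without going through certificate signing again.
def classify(certname)
  # Assumed naming scheme: <index>.<role>.<env>.<dc>.wibble.com
  pattern = /\A\d+\.(?<role>[a-z0-9]+)\.(?<env>[a-z0-9]+)\.(?<dc>[a-z0-9]+)\.wibble\.com\z/
  match = pattern.match(certname)
  # Refuse to guess: an unexpected certname gets no classification at all.
  raise ArgumentError, "unexpected certname: #{certname}" if match.nil?

  { 'role' => match[:role], 'env' => match[:env], 'dc' => match[:dc] }
end

puts classify('1.app.dev.dc1.wibble.com').inspect
```

Anchoring the pattern at both ends means a certname that merely contains a valid-looking fragment is rejected rather than partially matched.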