Erik Andersen had never been interested in architecture, but playing Minecraft – a popular 3D video game where users build and navigate their own digital environments – he found himself constructing a brick-by-brick scale model of a temple he’d once seen in Bangkok.

“Suddenly I was actually starting to look at buildings and think about their design features,” said Andersen, assistant professor of computer science. “I’m generally interested in education in games, and I was impressed by how this game got me interested in architecture.”

Seeking to harness that educational potential, Andersen and three other Cornell computer scientists developed a Minecraft modification that uses artificial intelligence to tell players whether their buildings fit into certain architectural styles, and offers ideas for how the structures could be improved. Their modification – which is not yet publicly available – also helped advance the researchers’ work in computer vision and human-robot communication.

The authors will present their paper, “Design Mining for Minecraft Architecture,” at the Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence and Interactive Digital Entertainment, Nov. 13-17, in Edmonton, Alberta, Canada.

“Minecraft, without intending to be an educational tool, has been pulled in so many different directions, where originally it was just for fighting monsters,” said senior author Ross Knepper, assistant professor of computer science. “One of the things that’s important to learn when you’re a kid and throughout life is creativity, abstraction – how to envision what you want and then create it. That’s not an easy skill for anyone. So this is a tool that helps people not get discouraged, maybe if they’re beginning at Minecraft and don’t know how to use their imagination right off the bat.”

Based on buildings that Minecraft players created and uploaded for others to use, the researchers created a deep neural network – a kind of machine learning model trained to predict whether data belongs in a certain category. Through that network, players could learn whether their building is medieval, modern, Asian or classical – four especially popular tags used by Minecraft players. Once the building is classified, another algorithm can show users similar buildings to inspire improvements to their own.

The program also allows users to import similar buildings into their Minecraft worlds, creating a neighborhood of like styles.
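
The classify-then-retrieve flow described above can be illustrated with a minimal sketch. It is not the researchers' deep neural network: it swaps in a much simpler technique, comparing block-type counts with cosine similarity and taking a nearest-neighbor vote. The voxel format, building names, block names and function names are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# The four style tags named in the article.
STYLES = ("medieval", "modern", "Asian", "classical")

def block_histogram(voxels):
    """Count each non-air block type in a 3D grid (nested lists of block names)."""
    return Counter(
        block for layer in voxels for row in layer for block in row if block != "air"
    )

def cosine_similarity(a, b):
    """Cosine similarity between two block-count histograms."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, library, k=3):
    """Return the k (similarity, name, style) tuples closest to the query building."""
    q = block_histogram(query)
    scored = [
        (cosine_similarity(q, block_histogram(voxels)), name, style)
        for name, style, voxels in library
    ]
    return sorted(scored, key=lambda t: t[0], reverse=True)[:k]

def predict_style(query, library, k=3):
    """Guess a style by majority vote among the k most similar library buildings."""
    votes = Counter(style for _, _, style in most_similar(query, library, k))
    return votes.most_common(1)[0][0]

# Hypothetical toy library: (name, style tag, one-layer 2x2 voxel grid).
library = [
    ("stone_keep", "medieval", [[["cobblestone", "cobblestone"], ["oak_planks", "cobblestone"]]]),
    ("glass_box", "modern", [[["glass", "quartz"], ["glass", "glass"]]]),
    ("pagoda", "Asian", [[["red_wool", "oak_planks"], ["red_wool", "red_wool"]]]),
    ("temple", "classical", [[["quartz", "quartz"], ["quartz", "air"]]]),
]

# A player's new building, made mostly of cobblestone.
query = [[["cobblestone", "cobblestone"], ["cobblestone", "air"]]]

print(predict_style(query, library, k=1))
print(most_similar(query, library, k=1))
```

In this toy setup the same similarity ranking serves both purposes: the top matches supply the style guess and double as the "similar buildings" a player might browse or import. A real system would need far richer features than block counts, which ignore shape entirely.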

“It’s a way for the users to learn more about the thing they built,” said Irene (Euisun) Yoon ’19, first author on the paper. “People are really interested in having more design spaces in Minecraft, and being able to build certain types of architecture, but there weren’t any design tools as far as we were aware that can teach them.”

Because the algorithm was trained with fewer than 1,000 player-created buildings, it was less accurate than the researchers would have liked, so Yoon personally curated the data set to make sure the buildings were labeled correctly. Ideally, such an algorithm would be trained with tens or hundreds of thousands of pieces of data.

“If you ask an architect to tell you what a building’s style is, the architect will say, ‘OK, it’s one-and-a-half stories, it has dormers, it’s a Cape Cod.’ Deep learning is doing that but it’s doing it in a black box way (hidden from view). It learns patterns, but not necessarily the same patterns an architect would say are the key things,” Knepper said. For example, if all the modern-style houses in a data set have pools on the roof, the computer could assume that rooftop pools are a requirement for modern houses.
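
The rooftop-pool scenario can be reproduced in miniature. The sketch below is not deep learning: it uses a deliberately simple learner that picks the single feature best matching the "modern" label, which is enough to show the same failure mode. The feature names and the tiny data set are invented for illustration.

```python
# Hypothetical toy data: each building is (features, style tag).
# Every modern house here happens to have a rooftop pool.
training = [
    ({"flat_roof": True,  "glass_walls": True,  "rooftop_pool": True},  "modern"),
    ({"flat_roof": True,  "glass_walls": False, "rooftop_pool": True},  "modern"),
    ({"flat_roof": False, "glass_walls": True,  "rooftop_pool": True},  "modern"),
    ({"flat_roof": True,  "glass_walls": False, "rooftop_pool": False}, "medieval"),
    ({"flat_roof": False, "glass_walls": True,  "rooftop_pool": False}, "classical"),
]

def feature_accuracy(data, feature, target="modern"):
    """Fraction of buildings where 'feature present' agrees with 'style is target'."""
    hits = sum(feats[feature] == (style == target) for feats, style in data)
    return hits / len(data)

def best_single_feature(data, target="modern"):
    """Pick the one feature that best separates the target style from the rest."""
    features = sorted(data[0][0])
    return max(features, key=lambda f: feature_accuracy(data, f, target))

rule = best_single_feature(training)
print(rule, feature_accuracy(training, rule))

# A modern-looking house without a pool: the learned rule says "not modern".
new_house = {"flat_roof": True, "glass_walls": True, "rooftop_pool": False}
print("predicted modern:", new_house[rule])
```

On this data the learner settles on `rooftop_pool` because it fits the training set perfectly, then rejects a pool-free modern house. A larger and more varied data set is the usual remedy, since it breaks the accidental link between the feature and the label.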

For Knepper, a roboticist by training, the Minecraft project helped answer questions about how a robot might follow a human’s instructions.

“If I say, ‘Build a house,’ today a robot is going to say, ‘I don’t know what that means.’ ‘Which brick should I put where?’ is the level at which robots need instruction,” Knepper said. “We’d like humans to be able to interface with robots more like we interface with each other. So if I tell it to build a medieval house or an ancient house and give some of the high-level details, it would know at that point how to turn it into a plausible thing that does everything you want. We’re not there yet, but this is the first step towards that goal.”

Co-author Bharath Hariharan, assistant professor of computer science, approached the research from the perspective of his own work in computer vision. In trying to interpret an image, a computer can be trained to pick up cues such as shape and solidity, but may have trouble processing perspective or scale. Using people’s intelligence through their Minecraft structures and tags can help computers learn to solve those problems.

“When you’re working with images, it’s really hard to actually get at the essence of what something is,” Hariharan said. “A machine observing how people build can actually learn quite a bit about what shape is, what structure is, what buildings are.”

The paper is based on work supported by the National Science Foundation.