Beyond code: A Comprehensive Study on Website Builders, Their Limitations, and Opportunities for Innovation

Nanyang Technological University

Abstract

The rise of website builders has simplified online presence creation for individuals and businesses. However, existing platforms face limitations that hinder broader adoption. This study investigates the current landscape of website builders through case studies and user surveys, uncovering common limitations such as limited customisability, inaccurate documentation, and a lack of versioning tools. Building on these insights, the study explores opportunities to address them: a Large Language Model (LLM) powered frontend UI generation functionality, a Plugins-of-Plugins system, an HTML importation system, and an automated documentation generation feature. The study then demonstrates the technical feasibility of these features through experiments and a proof-of-concept website builder. The experiments also show that passing a partial design or sketch is a valid approach for making minor modifications to an existing HTML codebase, and that commercial models outperform open-source models in this task.

Introduction

Motivation

Website builders are easy to use and widely adopted, yet the experience of building a website on these platforms is not always good.

Methodology

We aim to understand the current landscape of website builders. We follow a two-phase approach to identify their limitations and the opportunities for innovation.

In the first phase, we identify the limitations of existing website builders through case studies of existing platforms and a user survey.

In the second phase, we propose a set of features that address the limitations identified in the first phase. To ascertain their technical feasibility, we run experiments and implement the features in a website builder.

Limitations of Website Builders

We found lack of customisability, inaccurate documentation, and poor versioning tools to be the most common limitations in website builders. These findings remain consistent across case studies and user surveys.

Case Studies

We helped FOGG, a company that designs learning programmes for children, build a website for their business needs. We took over their existing websites built on Wix and built on top of them to identify the limitations of existing website builders. We also conducted case studies on other website builders, such as WordPress and 10Web AI Builder, to gain a deeper understanding of their features and limitations.

The following are the limitations we found in the case studies:

User Surveys

We collected 21 valid responses from the user survey after filtering out those that did not meet the criteria (e.g., irrelevant responses or responses completed too quickly).
The survey results show that the most common limitations in website builders are lack of customisability, poor editing experience, inaccurate documentation, and poor versioning tools. These findings are consistent with the case studies conducted.

Survey Results on Limitations
Survey Results on Limitations (Open-Ended)

Proposed Solutions

  1. Lack of Customisability
    • Frontend UI Generation: We propose a feature that lets users generate frontend UIs from full or partial designs and sketches. The feature is powered by a Large Language Model (LLM).
    • Plugins-of-Plugins System: We propose a system that allows users to create plugins that can be used by other plugins, thus allowing for more customisability.
    • HTML Importation System: We propose a system that allows users to import HTML code into the website builder, thus allowing users to use existing HTML code in the website builder.
  2. Inaccurate Documentation
    • Automated Documentation Generation: We propose a feature that automatically generates documentation for the website builder, thus ensuring that the documentation is always up-to-date and accurate.
  3. Poor Versioning Tools
    • Git-powered Version Control: We propose a system that integrates Git for version control, allowing users to track changes, revert to previous versions, and collaborate more effectively.
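
To make the Plugins-of-Plugins idea concrete, the following is a minimal sketch of a plugin that other plugins can extend. It illustrates the concept only and is not the implementation used in this study; the `Plugin` class, the `extend` and `run` methods, and the "render" hook name are all hypothetical.

```python
from typing import Callable, Dict, List


class Plugin:
    """A plugin that can itself be extended by other plugins (hypothetical design)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Handlers contributed by other plugins, keyed by hook name.
        self._hooks: Dict[str, List[Callable[[str], str]]] = {}

    def extend(self, hook: str, handler: Callable[[str], str]) -> None:
        """Let another plugin attach a handler to one of this plugin's hooks."""
        self._hooks.setdefault(hook, []).append(handler)

    def run(self, hook: str, value: str) -> str:
        """Pass `value` through every handler registered on `hook`, in order."""
        for handler in self._hooks.get(hook, []):
            value = handler(value)
        return value


# A base "button" plugin exposes a hypothetical "render" hook.
button = Plugin("button")

# Two sub-plugins customise the base plugin without modifying its code.
button.extend("render", lambda html: html.replace("<button>", '<button class="rounded">'))
button.extend("render", lambda html: f"<div>{html}</div>")

rendered = button.run("render", "<button>Buy</button>")
print(rendered)
```

Because each sub-plugin only touches a named hook, a user could stack customisations on an existing plugin instead of forking it.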

Experiment: Partial Frontend UI Generation

Motivation

During our case study, we saw a pattern where the business owner modified the design of some small components from the layout the external contractors designed. We argue that partial frontend UI generation is valuable, especially for non-technical people, such as SME owners or product managers, to make minor changes without having to worry about the underlying code. This helps with customisability in website builders.

Data Preparation

High-fidelity Design

We duplicated the 25 web pages into a dataset of 50 web pages. For each web page, we performed the following steps:
  1. We randomly chose one group of adjacent components to remove.
  2. We took a full screenshot of the original web page and extracted the screenshot of those components.
  3. We manually removed the chosen group of components from the HTML code and replaced it with a <missing> tag.
  4. We then applied uncss, a JavaScript tool for removing unused CSS, to strip all unused styles from the codebase.
We refer to this web page with the <missing> tag as “Partial HTML” in the later sections.
High-fidelity Design Preparation
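
The removal in step 3 was done by hand in this study. Purely as an illustration of what the resulting Partial HTML looks like, here is a small sketch that swaps one contiguous component group for a placeholder. The function name `make_partial_html`, the sample page, and the paired `<missing></missing>` spelling of the tag are assumptions, not details taken from the study.

```python
def make_partial_html(html: str, group_html: str,
                      placeholder: str = "<missing></missing>") -> str:
    """Replace exactly one occurrence of `group_html` with a placeholder tag."""
    count = html.count(group_html)
    if count != 1:
        # Refuse ambiguous or missing matches rather than guess which one to remove.
        raise ValueError(f"expected exactly one match, found {count}")
    return html.replace(group_html, placeholder, 1)


# Hypothetical page; the removed components are adjacent, so they form one substring.
page = "<main><h1>Title</h1><p>Intro</p><button>Sign up</button><footer>End</footer></main>"
group = "<p>Intro</p><button>Sign up</button>"

partial_html = make_partial_html(page, group)
print(partial_html)
```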

Sketch

We performed similar steps for the sketches, except that:
  1. We manually drew the sketches based on the selected component group.
  2. We also overlaid our sketch on the selected component group in the screenshot, for prompting purposes later in the experiments.
High-fidelity Sketch Preparation

Prompting Methods

We evaluate the VLLMs using four different prompting methods:
  1. Merging: We remove the <missing> tag from the partial HTML and feed it together with a merged screenshot of the new design/sketch. The VLLM identifies and inserts or modifies the code as needed.
  2. Merging with Hint: We keep the <missing> tag in the partial HTML as a hint, and ask the VLLM to replace it using the merged screenshot of the new design/sketch.
  3. Fixed-sized with Hint: We provide partial HTML with the <missing> tag and the design/sketch of the removed component group rendered at a fixed width of 1280 pixels, based on previous studies and tool defaults.
  4. Varied-sized with Hint: We provide partial HTML with the <missing> tag and the design/sketch of the removed component group in varied sizes. This accounts for users preparing designs/sketches without fixed widths, ensuring readability and responsiveness even if sketches differ slightly from the original layout.
Prompting Methods 1
Prompting Methods 2
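
The four methods differ only in what HTML and which image the model receives. The sketch below summarises that mapping; it is an illustration under stated assumptions, not the study's prompting code. The `ModelInput` type, the method identifiers, the image labels, and the paired spelling of the <missing> tag are hypothetical, and no actual prompt wording or model call is shown.

```python
from dataclasses import dataclass
from typing import Optional

MISSING_TAG = "<missing></missing>"  # assumed spelling of the placeholder tag


@dataclass(frozen=True)
class ModelInput:
    html: str                   # HTML passed to the model
    image_kind: str             # hypothetical label for the attached image
    image_width: Optional[int]  # fixed render width in pixels, or None


def build_input(method: str, partial_html: str) -> ModelInput:
    """Map a prompting method name to the inputs described in the list above."""
    if method == "merging":
        # No hint: strip the placeholder and rely on the merged screenshot alone.
        return ModelInput(partial_html.replace(MISSING_TAG, ""), "merged_screenshot", None)
    if method == "merging_with_hint":
        return ModelInput(partial_html, "merged_screenshot", None)
    if method == "fixed_sized_with_hint":
        # Only the removed component group, rendered at 1280 px wide.
        return ModelInput(partial_html, "component_only", 1280)
    if method == "varied_sized_with_hint":
        return ModelInput(partial_html, "component_only", None)
    raise ValueError(f"unknown prompting method: {method}")


partial = "<main><h1>Title</h1><missing></missing></main>"
merging = build_input("merging", partial)
fixed = build_input("fixed_sized_with_hint", partial)
print(merging.html)
print(fixed.image_width)
```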

Research Questions

  1. Which prompting method consistently produces the most accurate code?
    • Open-source models:
      • Merging
      • Overall lower score
    • Commercial models:
      • Fixed-Sized with Hint & Varied-Sized with Hint
      • Gemini 2.0 Flash and GPT-4o are top performers
    Heatmap of Results by Prompting Methods
  2. How consistent are the VLLMs in the tasks?
    • Open-source models:
      • 100% success rate other than Gemma2:27b
      • Mostly generate in one go, sometimes need multiple iterations
    • Commercial models:
      • 100% success rate
      • Generate responses in one go
  3. How responsive is the code generated by VLLMs?
    Responsiveness of Code Generated by VLLMs
    • Overall low inter-viewport variance
    • Ranging from 0.01 to 0.02, most less than 0.01
    • Generated websites remain responsive, as the HTML pages in our test dataset are all responsive
    Inter-Viewport Variance of Generated Code
  4. Does the fidelity of the input affect VLLM's performance?
    Instead of using the design, we use the sketch dataset to evaluate the VLLMs' performance. We found the trend to be similar to the high-fidelity design dataset, as shown in the results below:
    • Open-source models:
      • Merging & Merging with Hint
      • Overall lower score
    • Commercial models:
      • Varied-Sized with Hint
      • High score
      • Gemini 2.0 Flash and GPT-4o remain top performers
    Heatmap of Results with Sketch By Prompt
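
The page does not define how the inter-viewport variance in question 3 is computed. One plausible reading, shown here strictly as an assumption, is the population variance of a per-viewport similarity score for the same generated page. The viewport names and scores below are made up for illustration.

```python
from statistics import pvariance
from typing import Dict


def inter_viewport_variance(scores: Dict[str, float]) -> float:
    """Population variance of one page's score across viewports (assumed definition)."""
    return pvariance(list(scores.values()))


# Hypothetical similarity scores for one generated page at three viewport sizes.
scores = {"mobile": 0.80, "tablet": 0.90, "desktop": 1.00}
variance = inter_viewport_variance(scores)
print(round(variance, 4))
```

Under this reading, a value close to zero means the generated page scores about the same at every viewport size.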

Implementation

By extending GrapesJS, an open-source website builder, we create a website builder with our proposed solutions:
  1. Frontend UI Generation
  2. Plugins-of-Plugins System
  3. HTML Importation System
  4. Automated Documentation Generation
We chose GrapesJS as it is:
The video demos of the implementation can be found here.

BibTeX

@misc{10356_184126,
  author    = {Boon Hian Lim},
  title     = {Beyond code: a comprehensive study on website builders, their limitations, and opportunities for innovation},
  year      = {2025},
}